Kohulan / DECIMER-Image_Transformer

DECIMER Image Transformer is a deep-learning-based tool designed for automated recognition of chemical structure images. Leveraging transformer architectures, the model converts chemical images into SMILES strings, enabling the digitization of chemical data from scanned documents, literature, and patents.
MIT License
216 stars, 52 forks

Add data to your pretrained model #27

Closed: pythonnewbie3 closed this issue 2 years ago

pythonnewbie3 commented 2 years ago

Is there a way to just add data to the model you already trained, without retraining it completely?

Kohulan commented 2 years ago

You do not have to retrain completely. You can start your training from the model weights provided.

pythonnewbie3 commented 2 years ago

Thank you. Do you have documentation where this is explained?

Kohulan commented 2 years ago

No, we don't have that yet, but you could refer to any TensorFlow tutorial on fine-tuning from previous checkpoints. I will try to add information about fine-tuning when I get time.
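For readers looking for a starting point, here is a minimal sketch of resuming training from saved weights with TensorFlow's `tf.train.Checkpoint` and `tf.train.CheckpointManager`. It is not the DECIMER training script. The function names `has_checkpoint` and `restore_for_finetuning` are hypothetical, and the keyword names passed as `tracked` are an assumption: they must match the names used when the checkpoint was originally saved, so check the repository's training code for the exact ones.

```python
from pathlib import Path


def has_checkpoint(checkpoint_dir):
    """Return True if the directory holds at least one checkpoint index file.

    TensorFlow writes each checkpoint as '<prefix>.index' plus data shards.
    """
    return any(Path(checkpoint_dir).glob("*.index"))


def restore_for_finetuning(checkpoint_dir, **tracked):
    """Restore the newest checkpoint in checkpoint_dir into the tracked objects.

    `tracked` maps names to trackable objects, for example
    model=my_model, optimizer=my_optimizer. These names are assumptions and
    must match the names used when the checkpoint was saved.
    Returns the CheckpointManager so the training loop can keep saving.
    """
    # Imported lazily so has_checkpoint() works without TensorFlow installed.
    import tensorflow as tf

    ckpt = tf.train.Checkpoint(**tracked)
    manager = tf.train.CheckpointManager(ckpt, str(checkpoint_dir), max_to_keep=5)
    if manager.latest_checkpoint:
        # expect_partial() silences warnings about checkpoint values
        # that the current objects do not use.
        ckpt.restore(manager.latest_checkpoint).expect_partial()
    return manager
```

After restoring, training continues on the new data as usual, and calling `manager.save()` at the end of each epoch writes further checkpoints to the same directory. A lower learning rate than in the original training run is a common choice for fine-tuning, though the right value depends on the data.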

pythonnewbie3 commented 2 years ago

Thanks a lot. Do you have a tutorial on how to prepare the data so that training can continue from the checkpoint?

Kohulan commented 2 years ago

No, we don't. But if you want to train the model on your own, you have to generate TFRecord files from your images using the following Python files: https://github.com/Kohulan/DECIMER-Image_Transformer/tree/DECIMER_V1.0/TFRecord_Utils
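To illustrate the general idea of packing image and SMILES pairs into a TFRecord file, here is a rough sketch. It is not a replacement for the linked scripts, which define the format the DECIMER training code actually expects (including any image resizing and tokenization). The two-column TSV layout, the function names, and the feature keys `image_raw` and `caption` are all assumptions made for this example.

```python
import csv
from pathlib import Path


def load_pairs(image_dir, labels_tsv):
    """Pair image files with SMILES strings from a TSV of (file name, SMILES) rows."""
    pairs = []
    with open(labels_tsv, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) != 2:
                continue  # skip blank or malformed rows
            name, smiles = row
            path = Path(image_dir) / name
            if path.is_file():  # skip labels whose image is missing
                pairs.append((path, smiles))
    return pairs


def write_tfrecord(pairs, out_path):
    """Serialize (image path, SMILES) pairs into one TFRecord file."""
    # Imported lazily so load_pairs() works without TensorFlow installed.
    import tensorflow as tf

    def bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    with tf.io.TFRecordWriter(str(out_path)) as writer:
        for path, smiles in pairs:
            features = tf.train.Features(feature={
                # Hypothetical feature keys; the real ones are set by TFRecord_Utils.
                "image_raw": bytes_feature(path.read_bytes()),
                "caption": bytes_feature(smiles.encode("utf-8")),
            })
            example = tf.train.Example(features=features)
            writer.write(example.SerializeToString())
```

A call such as `write_tfrecord(load_pairs("images", "labels.tsv"), "train.tfrecord")` would produce a single file. Large datasets are usually split across several TFRecord files so they can be read in parallel.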

pythonnewbie3 commented 2 years ago

Where can I find the model weights?

Kohulan commented 2 years ago

The model weights are on Zenodo at https://zenodo.org/record/7180845, which contains the latest DECIMER V1.0 checkpoints and the DECIMER V2.0 models. The link is already in the README file, so please check there as well.

pythonnewbie3 commented 2 years ago

I encountered a problem when installing DECIMER on a different computer: 'urllib.error.HTTPError: HTTP Error 403: Forbidden'. I think the firewall is blocking this IP address. Where do I have to unzip the model files on the computer?

Kohulan commented 2 years ago

I am not sure which OS you are trying to install DECIMER on. I would highly recommend using a Linux-based operating system. The models should be downloaded and unzipped into the user directory, for example: /Users/user_name/.data/DECIMER-V2
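If the automatic download is blocked, the manual route can be sketched as follows: fetch the archive from Zenodo in a browser, then extract it into the data directory. This sketch uses only the Python standard library. The function name `install_models` is hypothetical, and the default target simply mirrors the example path above. Whether the archive already contains a top-level folder is not known here, so inspect the extracted layout afterwards.

```python
import zipfile
from pathlib import Path


def install_models(archive_path, target_dir=None):
    """Extract a manually downloaded model archive and list the extracted files.

    The default target is an assumption based on the example path
    ~/.data/DECIMER-V2; pass target_dir explicitly to override it.
    """
    if target_dir is None:
        target_dir = Path.home() / ".data" / "DECIMER-V2"
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target_dir)
    # Return paths relative to the target so the layout is easy to check.
    return sorted(
        p.relative_to(target_dir) for p in target_dir.rglob("*") if p.is_file()
    )
```

Printing the returned list shows at a glance whether the files ended up directly in the target directory or one level deeper, in which case they may need to be moved.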

pythonnewbie3 commented 2 years ago

Thanks a lot. :-)