instadeepai / InstaNovo

De novo peptide sequencing with InstaNovo: Accurate, database-free peptide identification for large scale proteomics experiments
Apache License 2.0

Illustration of pretrained model checkpoint #37

Open irleader opened 1 month ago

irleader commented 1 month ago

Hi,

Changelog for release 0.1.4 states: "add checkpoints instanovo.pt trained on HC-PT, and instanovo_yeast.pt fine-tuned on nine-species excluding yeast."

I checked the vocab/residues of both model checkpoints: instanovo.pt does not have 'N(+.98)' or 'Q(+.98)' and uses 'C' and 'M(ox)', whereas instanovo_yeast.pt has 'N(+.98)' and 'Q(+.98)' and uses 'C(+57.02)' and 'M(+15.99)'.
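For reference, the comparison can be sketched roughly as follows. This is only a minimal sketch: it assumes the residue tokens have already been read out of each checkpoint, it lists only the tokens quoted in this post rather than the full vocabularies, and `compare_vocabs` is a hypothetical helper, not part of InstaNovo.

```python
# Hypothetical helper (not an InstaNovo API): report tokens unique to each vocabulary.
def compare_vocabs(vocab_a, vocab_b):
    a, b = set(vocab_a), set(vocab_b)
    return {
        "only_in_a": sorted(a - b),  # tokens the second model cannot map
        "only_in_b": sorted(b - a),  # tokens the first model never saw
        "shared": sorted(a & b),
    }

# Assumption: only the tokens mentioned in this issue, not the complete residue sets.
instanovo_tokens = ["C", "M(ox)"]
instanovo_yeast_tokens = ["N(+.98)", "Q(+.98)", "C(+57.02)", "M(+15.99)"]

diff = compare_vocabs(instanovo_tokens, instanovo_yeast_tokens)
print("only in instanovo.pt:      ", diff["only_in_a"])
print("only in instanovo_yeast.pt:", diff["only_in_b"])
```

Any non-empty "only in" list means the output token sets of the two models are not identical, which is what prompted the question.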

Therefore, I doubt that instanovo_yeast.pt was fine-tuned from instanovo.pt, since the two have different vocabs/residues.

So was instanovo_yeast.pt trained from scratch on the nine-species dataset excluding yeast?

Thanks!

irleader commented 1 month ago

Also, would it be possible to share a version of the instanovo.pt checkpoint that is trained on HC-PT but has 'N(+.98)' and 'Q(+.98)' and uses 'C(+57.02)' and 'M(+15.99)'? Thanks a lot!