nbrg-ppcu / prokbert

MIT License

Training data URL on zenodo #39

Open xvtyzn opened 9 months ago

xvtyzn commented 9 months ago

Dear ProkBERT developers,

Thank you for publishing this excellent model, and congratulations on the publication of the peer-reviewed paper (https://www.frontiersin.org/articles/10.3389/fmicb.2023.1331233/full).

The paper states that all training data has been published on Zenodo (10.5281/zenodo.10057832). However, I cannot find that data.

Perhaps the publication settings of the record have not been changed yet. It would be great if you could comment on this.

Best regards,

Keigo

obalasz commented 8 months ago

Dear Keigo,

Thank you very much for your kind words and interest. Regarding the training data referenced in our paper and intended to be available on Zenodo: unfortunately, we have encountered technical difficulties in sharing large training datasets. So that we can address your needs as effectively as possible, could you please specify which dataset you are particularly interested in?

In the meantime, we're glad to share that alternative resources and datasets are available for you to explore and use, as mentioned in our previous communication:

- Bacterial Promoters Dataset: [Hugging Face - Bacterial Promoters](https://huggingface.co/datasets/neuralbioinfo/bacterial_promoters)
- ESKAPE Genomic Feature Dataset: [Hugging Face - ESKAPE Genomic Features](https://huggingface.co/datasets/neuralbioinfo/ESKAPE-genomic-features)
- Phage Test Set (10k): [Hugging Face - Phage Test 10k](https://huggingface.co/datasets/neuralbioinfo/phage-test-10k)
- Pretrained and Finetuned Models: [Hugging Face - Models](https://huggingface.co/datasets/neuralbioinfo/)
- Example Notebooks: [GitHub - prokbert Examples](https://github.com/nbrg-ppcu/prokbert/tree/main/examples)
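
For anyone who prefers to fetch the three datasets listed above from a script, here is a minimal sketch. It is not taken from the ProkBERT repository. It assumes the Hugging Face `datasets` package is installed and that each dataset loads without a configuration name, which this thread does not confirm. The helper names `repo_id_from_url` and `load` are hypothetical.

```python
from urllib.parse import urlparse

# Dataset pages copied from the list above.
DATASET_URLS = [
    "https://huggingface.co/datasets/neuralbioinfo/bacterial_promoters",
    "https://huggingface.co/datasets/neuralbioinfo/ESKAPE-genomic-features",
    "https://huggingface.co/datasets/neuralbioinfo/phage-test-10k",
]


def repo_id_from_url(url: str) -> str:
    """Turn a Hugging Face dataset page URL into an 'owner/name' repo id."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a dataset page URL: {url}")
    return f"{parts[1]}/{parts[2]}"


def load(url: str):
    """Download a dataset from the Hub (needs network access and `datasets`).

    Assumption: the dataset has a default configuration; if it does not,
    `load_dataset` will ask for a configuration name.
    """
    from datasets import load_dataset  # lazy import so URL parsing works offline

    return load_dataset(repo_id_from_url(url))


if __name__ == "__main__":
    # Only prints the repo ids; nothing is downloaded here.
    for url in DATASET_URLS:
        print(repo_id_from_url(url))
```

Calling `load(DATASET_URLS[0])` and printing the result shows which splits and columns a dataset provides; those are not described in this thread, so inspect them before relying on any particular name.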

Should you have any further questions or need additional assistance in the meantime, please do not hesitate to reach out.

Best regards,

Balázs