dqwang122 / EATLM

Code for 'On Pre-trained Language Models For Antibody'

Data format #2

Open huachui opened 6 months ago

huachui commented 6 months ago

Hi, thanks for your work!

I'm exploring the 'antibody\utils\preprocess.py' script and would like to inquire about the expected format of the 'cell.jsonl' file processed in the function at line 104. If sharing the data is not possible, any insights into the format or processing steps would be greatly appreciated.

dqwang122 commented 6 months ago

Hi,

Thanks for your interest! I believe the 'cell.jsonl' file referenced in the script is actually 'Bcell.germline.jsonl' from the dataset we released on Zenodo (link: https://zenodo.org/records/7340488#.Y3sf4uxBxhE). Sorry for the confusion.
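
The thread does not spell out the fields inside the file, so one way to see the format is to look at the first few records directly. Below is a minimal sketch, assuming the file is standard JSON Lines (one JSON value per line) and has been downloaded locally; `peek_jsonl` and `count_keys` are hypothetical helper names and are not part of the EATLM repository.

```python
import json
from collections import Counter
from itertools import islice


def peek_jsonl(path, n=3):
    """Return the first n non-blank records of a JSON Lines file."""
    records = []
    with open(path, encoding="utf-8") as handle:
        # Skip blank lines so stray empty lines do not break json.loads.
        lines = (line for line in handle if line.strip())
        for line in islice(lines, n):
            records.append(json.loads(line))
    return records


def count_keys(records):
    """Count how often each top-level key appears across dict records."""
    counts = Counter()
    for record in records:
        # Non-dict records (lists, strings) have no keys to count.
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts
```

Calling `peek_jsonl("Bcell.germline.jsonl", n=5)` (the path is an assumption about where the download was saved) and passing the result to `count_keys` shows which top-level keys the records share, without loading the whole file into memory.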

huachui commented 6 months ago

Thank you for your helpful response!