Closed: brianloyal closed this issue 1 year ago
FYI I created this dataset in HuggingFace for now. Please let me know if this causes any issues. Thanks again!
Hey Brian, just an FYI that the sample dataset we used to have was really just a snippet of the data the model was trained on. The code repo here was more for people to get a flavour of how they could train an antibody language model (of which there are now tons of examples!). Thanks for uploading that to HF.
@brianloyal Hi, I found the dataset you created incredibly valuable. However, as a beginner in this field, I'm curious about the preprocessing steps you applied to the data sourced from the OAS database. Could you please share more details about them?
Hey team, great work on this project. I noticed that the pre-training data snippets were removed from the repository back in April. Are they available as a HF dataset? Or available someplace else (besides the git history)?