a96123155 / UTR-LM

GNU General Public License v3.0

Data Processing Steps for 5' UTR Extraction #6

Open yuqinT10 opened 3 months ago

yuqinT10 commented 3 months ago

I have been studying your work on the UTR-LM model and find the research highly insightful. I have a question about the data processing steps described in Section A.1 of the supporting information for your paper. The paper mentions that 5' UTR sequences were collected from the Ensembl database and then passed through several cleaning and filtering steps, but it does not fully explain how the sequences were extracted from Ensembl and processed into the final dataset used in the study. Could you please provide more detail on these steps? Thank you very much for your time and assistance. I look forward to your response.

Screenshot 2024-08-26 17 37 38