GoekeLab / m6anet

Detection of m6A from direct RNA-Seq data
https://m6anet.readthedocs.io/
MIT License
104 stars, 19 forks

[Request]: eventalign.tsv.gz support #95

Closed. loganylchen closed this issue 1 year ago

loganylchen commented 1 year ago

Hi there,

This is a very nice tool. May I ask whether m6anet could support compressed eventalign results? Plain eventalign files usually take up a lot of disk space.

Best

chrishendra93 commented 1 year ago

Hi @loganylchen, this crossed my mind some time ago, but I didn't research it enough to implement it. Currently m6Anet works with uncompressed eventalign results because a plain text file is easy to index with Python, but maybe there is a way to index and read the compressed results. I will look into this for the next feature release, as we too have problems storing nanopolish eventalign results. Feel free to propose anything if you have ideas on how we should proceed with implementing this feature!
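To make the indexing point concrete, here is a minimal sketch of byte-offset indexing. It is not m6Anet's actual code: `build_offset_index`, `read_rows`, the toy table, and its column names are hypothetical, and it assumes the rows for each key are contiguous. The same offsets also work on a gzip stream, because `gzip.open` reports positions in the uncompressed data, but seeking there is emulated by decompressing from the start of the file, so random access gets slow on large files.

```python
import gzip
import os
import tempfile


def build_offset_index(path, key_column=0, opener=open):
    """Map each key to the (start, end) byte range of its contiguous rows."""
    index = {}
    with opener(path, "rb") as handle:
        handle.readline()  # skip the header row
        start = pos = handle.tell()
        current_key = None
        for line in iter(handle.readline, b""):
            key = line.split(b"\t")[key_column].decode()
            if key != current_key:
                if current_key is not None:
                    index[current_key] = (start, pos)
                current_key, start = key, pos
            pos = handle.tell()  # offset where the next row begins
        if current_key is not None:
            index[current_key] = (start, pos)
    return index


def read_rows(path, span, opener=open):
    """Jump to a byte range and return only the rows stored there."""
    start, end = span
    with opener(path, "rb") as handle:
        handle.seek(start)
        return handle.read(end - start).decode().splitlines()


# Toy stand-in for an eventalign table (not real nanopolish output).
toy_bytes = (
    "contig\tposition\tread_index\tevent_level_mean\n"
    "tx1\t10\t0\t101.5\n"
    "tx1\t11\t0\t99.2\n"
    "tx2\t5\t1\t87.0\n"
).encode()

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "toy_eventalign.tsv")
    gz_path = plain_path + ".gz"
    with open(plain_path, "wb") as handle:
        handle.write(toy_bytes)
    with gzip.open(gz_path, "wb") as handle:
        handle.write(toy_bytes)

    plain_index = build_offset_index(plain_path)
    gz_index = build_offset_index(gz_path, opener=gzip.open)

    plain_tx2 = read_rows(plain_path, plain_index["tx2"])
    # Works, but gzip has to decompress everything before the offset.
    gz_tx2 = read_rows(gz_path, gz_index["tx2"], opener=gzip.open)

print(plain_index == gz_index, plain_tx2 == gz_tx2)
```

On the plain file the seek is a constant-time jump, whereas on the gzip file the cost grows with the offset. That difference is the main thing any compressed-input design would have to work around.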

loganylchen commented 1 year ago

I read parts of the source code, and I think it may use the same function as xPore. I noticed that you use pandas to access the plain data and also build an index for it. I know pandas can read gzip-compressed files, but I am not sure whether the chunk size would be the same for uncompressed and compressed eventalign results.
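A quick way to check the chunk question is the sketch below. It uses a toy table rather than real eventalign output, and `chunk_shapes` is a hypothetical helper. As far as I understand, `chunksize` in `pandas.read_csv` counts rows rather than bytes, so the chunks should have the same shape whether or not the file is compressed. What does not carry over is any byte offset into the file on disk.

```python
import gzip
import os
import tempfile

import pandas as pd

# Toy table with 9 data rows (hypothetical columns, not real eventalign output).
toy_table = "contig\tposition\tevent_level_mean\n" + "".join(
    f"tx{i // 3}\t{i}\t{100.0 + i}\n" for i in range(9)
)


def chunk_shapes(path, rows_per_chunk=4):
    """Return the (rows, columns) shape of every chunk pandas yields."""
    # compression defaults to "infer", so the ".gz" suffix selects gzip.
    reader = pd.read_csv(path, sep="\t", chunksize=rows_per_chunk)
    return [chunk.shape for chunk in reader]


with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "toy_eventalign.tsv")
    gz_path = plain_path + ".gz"
    with open(plain_path, "w") as handle:
        handle.write(toy_table)
    with gzip.open(gz_path, "wt") as handle:
        handle.write(toy_table)

    plain_shapes = chunk_shapes(plain_path)
    gz_shapes = chunk_shapes(gz_path)

print(plain_shapes)
print(gz_shapes)
```

So reading in chunks would work unchanged, but an index that stores byte positions from the plain file cannot be used to jump into the `.gz` file directly.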

Would you be able to look into this direction?

Best