GoekeLab / m6anet

Detection of m6A from direct RNA-Seq data
https://m6anet.readthedocs.io/
MIT License

Identification of m6A with the SG-NEx samples #139

Closed: VikArz02 closed this issue 11 months ago

VikArz02 commented 11 months ago

Hello! I want to try m6Anet on the SG-NEx dataset, using the data preprocessed for m6Anet. However, this dataset appears to lack the data.info file. Could you please advise how I can use this dataset with m6Anet? Thank you for your assistance.

VikArz02 commented 11 months ago

Also, if I rename data.readcount to data.info, I get an index error: KeyError: "['start', 'end'] not in index"

kristinrma commented 11 months ago

Hi @VikArz02,

Apologies for the outdated m6Anet preprocessed files; we will update them in the next SG-NEx release. For now, you can use this script, which appends the 'n_reads' column of data.readcount to data.index to create data.info. The result contains the same information as a data.info generated by m6anet dataprep.

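import pandas as pd

def create_data_info(data_index, data_readcount, data_info):
    # Load the per-position index and read-count tables.
    index_df = pd.read_csv(data_index)
    readcount_df = pd.read_csv(data_readcount)
    # Join on the row index, so both files must list positions in the same order.
    index_df = index_df.join(readcount_df['n_reads'], how='right')
    # Write the combined table as data.info, without the pandas row index.
    index_df.to_csv(data_info, index=False)

create_data_info('data.index', 'data.readcount', 'data.info')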
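
The script above pairs the two files by row position. As a hedged alternative, the following minimal sketch merges on explicit key columns instead, so a difference in row order between the files would not mismatch read counts. The key column names `transcript_id` and `transcript_position` are assumptions about the file layout (only `start`, `end`, and `n_reads` are named in this thread), and `build_info` and the toy values are hypothetical, for illustration only.

```python
import pandas as pd

# Assumed key columns shared by data.index and data.readcount (not confirmed in this thread).
KEYS = ["transcript_id", "transcript_position"]

def build_info(index_df, readcount_df):
    """Attach n_reads to the index table by matching on the key columns."""
    return index_df.merge(
        readcount_df[KEYS + ["n_reads"]],
        on=KEYS,
        how="left",               # keep every row of the index table, in its order
        validate="one_to_one",    # raise if a key is duplicated in either table
    )

# Toy stand-ins for data.index and data.readcount, with rows in different orders.
index_df = pd.DataFrame({
    "transcript_id": ["tx1", "tx1", "tx2"],
    "transcript_position": [10, 25, 7],
    "start": [0, 120, 240],
    "end": [120, 240, 380],
})
readcount_df = pd.DataFrame({
    "transcript_id": ["tx2", "tx1", "tx1"],
    "transcript_position": [7, 10, 25],
    "n_reads": [31, 20, 45],
})

info_df = build_info(index_df, readcount_df)
print(info_df)
```

With real files, the two DataFrames would come from `pd.read_csv('data.index')` and `pd.read_csv('data.readcount')`, and the result would be written with `info_df.to_csv('data.info', index=False)`, as in the script above.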

VikArz02 commented 11 months ago

I found a way to do this from the command line, as recommended in your 2.0.0 release: m6anet convert. Thank you for the fast answer!

kristinrma commented 11 months ago

No problem!