GoekeLab / m6anet

Detection of m6A from direct RNA-Seq data
https://m6anet.readthedocs.io/
MIT License

m6anet-dataprep error #27

Closed: baibing7713661 closed this issue 2 years ago

baibing7713661 commented 2 years ago

image

m6anet-dataprep --eventalign /binf-isilon/PBgrp/jfb841/nanopore/data_extract/20180227_1832_20180227_FAH59351_vir1_2922_DRS/basecalled_data/reads-ref.eventalign.vir1_1.txt --out_dir /binf-isilon/PBgrp/jfb841/nanopore/data_extract/m6anet_output.vir1_1 --n_processes 20

Hi, I ran m6anet-dataprep with the command above and the run gets stuck at that point. What could be causing this, and what does the "AssertionError" mean? Thanks.

Bing

chrishendra93 commented 2 years ago

Hi @baibing7713661, the assertion is there to make sure that only one kind of 5-mer motif is present at each position. Did you run this on genome-mapped reads? Alternatively, you can map the reads to the transcriptome and run eventalign at the transcriptome level.
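
To see whether an eventalign file would trip a check like the one described above, the file can be scanned for positions that carry more than one reference 5-mer. The following is a minimal diagnostic sketch, not m6anet's own code: the function name `find_mixed_kmer_positions` is hypothetical, the column names `contig`, `position`, and `reference_kmer` are assumed to match the nanopolish eventalign header, and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def find_mixed_kmer_positions(eventalign_text):
    """Return {(contig, position): sorted k-mers} for every position
    that is associated with more than one reference k-mer.

    Hypothetical helper; assumes a tab-separated table with a header row
    containing 'contig', 'position' and 'reference_kmer'.
    """
    reader = csv.DictReader(io.StringIO(eventalign_text), delimiter="\t")
    kmers_by_position = defaultdict(set)
    for row in reader:
        key = (row["contig"], int(row["position"]))
        kmers_by_position[key].add(row["reference_kmer"])
    # Keep only the positions where the k-mer is ambiguous.
    return {
        key: sorted(kmers)
        for key, kmers in kmers_by_position.items()
        if len(kmers) > 1
    }


# Made-up rows for illustration only (not taken from the dataset in this thread).
sample = (
    "contig\tposition\treference_kmer\tread_index\n"
    "transcript_1\t100\tGGACT\t0\n"
    "transcript_1\t100\tGGACT\t1\n"
    "transcript_1\t101\tGACTA\t0\n"
    "transcript_1\t101\tTAGTC\t1\n"
)
print(find_mixed_kmer_positions(sample))
```

For a real file, the same function can be applied to the text of a small slice of the eventalign output. An empty result means every position has a single 5-mer, so the cause probably lies elsewhere.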

baibing7713661 commented 2 years ago

Hi Chris, this is mapped to the transcriptome. I did not map to the genome since I want to map m6A on mRNA. Please let me know what I can do; I have tested different samples and they all seem to get stuck here.

chrishendra93 commented 2 years ago

Are these the Arabidopsis samples from Parker et al.? I have actually tried running m6anet on those samples. Can you show me the first few rows of your nanopolish eventalign.txt? Did you run nanopolish with the options --scale-events and --signal-index?

baibing7713661 commented 2 years ago

Yes, they are from Parker et al. I want to see whether all m6A signals can be detected without using a knock-out control. Please see the first lines of eventalign.txt below. I used --scale-events but not --signal-index (see the command used to generate eventalign.txt below).

nanopolish eventalign \
  --reads /binf-isilon/PBgrp/jfb841/nanopore/data_extract/20180301_1825_20180301_FAH54216_VIRcomp_2928_DRS/basecalled_data/virComp_1.fastq \
  --bam reads-ref.sorted.virComp_1.bam \
  --genome /binf-isilon/PBgrp/jfb841/ECT2_HyperTribe_Germination/X204SC21080124_Z01_F001/raw_data/data_merged/ECT2_Germination/hyperTRIBE_carsten/RNAeditR_extra-master/annotations/Araport11_201606.fasta \
  --scale-events > reads-ref.eventalignvirComp_1.txt

image

chrishendra93 commented 2 years ago

Ah, I see, that's the reason for the error. m6Anet requires the --signal-index parameter when you run nanopolish, because it tells m6anet-dataprep the signal length of each event, which is then used to compute a weighted average of the features per interval. I think you need to rerun nanopolish with both --scale-events and --signal-index. In the next version I will add a check on the nanopolish eventalign output for the presence of these columns.
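
The two ideas in this reply, checking the eventalign header for the signal-index columns and weighting each event by its signal length, can be sketched as follows. This is an illustration under stated assumptions rather than m6anet's actual implementation: the helper names are hypothetical, the signal-index columns are assumed to be called `start_idx` and `end_idx`, the signal length of an event is taken as `end_idx - start_idx`, and the header strings and event values are made up.

```python
# Columns assumed to be added by nanopolish's signal-index option.
REQUIRED_COLUMNS = ("start_idx", "end_idx")


def missing_signal_index_columns(header_line):
    """Return the required columns that are absent from a tab-separated header."""
    columns = header_line.rstrip("\n").split("\t")
    return [name for name in REQUIRED_COLUMNS if name not in columns]


def weighted_event_mean(events):
    """Average event means, weighting each event by its signal length.

    `events` is a list of (event_level_mean, start_idx, end_idx) tuples that
    belong to the same read and position (hypothetical input layout).
    """
    total_length = sum(end - start for _, start, end in events)
    if total_length == 0:
        raise ValueError("events cover zero signal samples")
    weighted_sum = sum(mean * (end - start) for mean, start, end in events)
    return weighted_sum / total_length


# Made-up headers: the first lacks the signal-index columns, the second has them.
without_index = "contig\tposition\treference_kmer\tevent_level_mean\tevent_length"
with_index = without_index + "\tstart_idx\tend_idx"
print(missing_signal_index_columns(without_index))  # both columns reported missing
print(missing_signal_index_columns(with_index))     # nothing missing

# Two made-up events at one position: 10 samples at 100.0 and 30 samples at 110.0.
print(weighted_event_mean([(100.0, 0, 10), (110.0, 10, 40)]))
```

Running the header check on the first line of an eventalign file before starting m6anet-dataprep is a cheap way to catch a missing option early.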

baibing7713661 commented 2 years ago

I re-ran eventalign with --scale-events and --signal-index; the output is below. However, the AssertionError persists and the run still gets stuck. Is there anything else I could try? Also, since you have tried Parker's data, it would be good to know whether m6A can be mapped without control samples.

image image

chrishendra93 commented 2 years ago

Do you mind sending me the eventalign.txt file? I am guessing this is from the smaller replicate? In my experience, it can still capture m6A sites without the control.

baibing7713661 commented 2 years ago

Sure, can you please give me your email for sending the files? And did you compare with and without control to see the difference?

chrishendra93 commented 2 years ago

Chrishendra93@gmail.com. I did run the predictions on the control as well, and they differ: fewer sites are detected in the control, and those sites do not have significant p-values in the G-test tables in Parker et al.

baibing7713661 commented 2 years ago

Dropbox link sent. Please let me know what comes out.

chrishendra93 commented 2 years ago

Yeah, I received it, but I do not have enough space in my Dropbox to accept it. Do you think you can share a Google Drive link instead?

baibing7713661 commented 2 years ago

Yes, google drive link sent.

chrishendra93 commented 2 years ago

Hi @baibing7713661, I managed to run m6anet-dataprep on the data that you sent me. I think you can try running it again, making sure that the eventalign.txt is the correct one. If it still does not work, you can also clone the release_1.2 branch (I have not fully tested it, but it should work), run python setup.py install, and run m6anet-dataprep from there.

baibing7713661 commented 2 years ago

Hi Chris, can you paste the command you used for m6anet-dataprep? I tried both versions and it still seems to get stuck at the same step. I want to see whether anything differs from my command.

baibing7713661 commented 2 years ago

image image Same as before: it generates some output files but gets stuck at the same step.

baibing7713661 commented 2 years ago

Also, how can I check that the version currently running is 1.2 rather than 1.0? I installed m6anet 1.2 in a different directory (/binf-isilon/PBgrp/jfb841/software_scratch/programs/m6anet_1.2/m6anet) and updated my environment via "source ~/.bashrc". However, I am not sure that the running version is 1.2, since it still uses dataprep.py under python3.8/site-packages/scripts/, as seen from the last line of the script.
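
One way to answer this kind of question is to ask the active Python interpreter which installed distribution, module file, and command-line script it actually resolves. The sketch below uses only the standard library. The helper name `describe_install` is hypothetical, and the distribution name `m6anet`, module name `m6anet`, and script name `m6anet-dataprep` are assumptions about how the package is installed.

```python
import importlib.util
import shutil
from importlib import metadata


def describe_install(dist_name, module_name, script_name):
    """Report what the current interpreter and PATH resolve for a package.

    Each value is None when the corresponding item cannot be found.
    """
    try:
        version = metadata.version(dist_name)  # version recorded at install time
    except metadata.PackageNotFoundError:
        version = None
    spec = importlib.util.find_spec(module_name)  # locates the module without importing it
    return {
        "version": version,
        "module_path": spec.origin if spec is not None else None,
        "script_path": shutil.which(script_name),  # first match on PATH
    }


if __name__ == "__main__":
    # Names below are assumptions; adjust them to the actual installation.
    info = describe_install("m6anet", "m6anet", "m6anet-dataprep")
    for key, value in info.items():
        print(f"{key}: {value}")
```

If the reported module path or script path points at an older site-packages directory, the new checkout is not the one being executed, which is consistent with an installation clash between two copies.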

chrishendra93 commented 2 years ago

The command that I used was m6anet-dataprep --eventalign reads-ref.eventalign.vir1_1.txt --output_dir . --n_processes 40. I would recommend creating a separate conda environment or virtual environment before running python setup.py install. It might be that the installations clash.

baibing7713661 commented 2 years ago

Hi Chris, I managed to run it after updating to 1.2. I have forwarded the output to you by email. Can you have a look?

Cheers, Bing

baibing7713661 commented 2 years ago

I just noticed the other method you co-authored, https://github.com/GoekeLab/xpore, for direct RNA modification detection. How does it differ from m6anet for m6A detection? Did you benchmark the two methods to see which one works better?

Cheers, Bing