DepledgeLab / DRUMMER

DRUMMER: Detection of RNA modifications in nanopore direct RNA Sequencing datasets
GNU General Public License v3.0

Question regarding output folders #28

Open kwonej0617 opened 1 year ago

kwonej0617 commented 1 year ago

Hello, thank you for developing a great tool. I ran DRUMMER on my data and obtained several output folders. I used the same command line for both sample 1 and sample 2, but the output files and directories generated for the two datasets differ. Could you please check whether one of the datasets was not processed completely?

data/output/DRUMMER/sample1/com/w-k: # no gTEST and MERGED directories generated
bam_readcount  complete_analysis  map  m6A_plot.pdf  summary.txt

data/output/DRUMMER/sample2/com/w-k: # no m6A_plot.pdf generated, no files in MOTIF and ODDS directories
bam_readcount  complete_analysis  gTEST  map  MERGED  MOTIF  ODDS  summary.txt

The outputs differ considerably between the two datasets, but the final summary.txt files appear to have been generated successfully in both cases.

Summary.txt for sample 1

transcript_id   reference_base  transcript_pos  depth_ctrl      depth_treat     ref_fraction_ctrl       ref_fraction_treat      frac_diff       odds_ratio      log2_(OR)       OR_padj eleven_bp_motif G_test  G_padj  accumulation    depletion       nearest_ac      nearest_ac_motif        homopolymer     is_SNP
E1A-10S A       519     244     355     0.783   0.966   0.183   7.932   -2.9875938919791065     9.26e-10        CGAGGACTTGC     52      1.02e-07                depletion       0       GGACT           
E1A-9S  C       327     1491    851     0.881   0.797   -0.084  0.531   0.9124045344268696      4.02e-05        GTGGTCCCGCT     29      3.61e-03        accumulation            10      ACACC   True    
E1A-9S  A       398     1484    856     0.757   0.923   0.165   3.834   -1.9387414998147603     4.22e-23        CGAGGACTTGC     113     8.33e-21                depletion       0       GGACT           
E1A-9S  A       428     1472    852     0.81    0.914   0.104   2.496   -1.3193979433156116     2.67e-09        TTTGGACTTGA     50      2.40e-07                depletion       0       GGACT   True 

Summary.txt for sample 2

transcript_id   reference_base  transcript_pos  depth_ctrl      depth_treat     ref_fraction_ctrl       ref_fraction_treat      frac_diff       odds_ratio      log2_(OR)       OR_padj eleven_bp_motif G_test  G_padj  accumulation    depletion       nearest_ac      nearest_ac_motif        homopolymer     is_SNP
ENST00000361567 C       364     715     855     0.745   0.839   0.093   1.774   -0.8271077693632564     1.02e-02        ATCTACTCATC     30      8.87e-03                depletion       1       CTACT
ENST00000361390 T       361     2523    3639    0.918   0.875   -0.043  0.629   0.6694548745489162      7.77e-05        CAGGGTGAGCA     29      6.52e-03        accumulation            -10     AAACT   True
ENST00000361453 G       563     2515    3488    0.998   0.983   -0.015  0.139   2.847005115114224       7.32e-06        CATAGGATGAA     46      3.01e-06        accumulation            6       CCACA
ENST00000361789 T       122     2880    3266    0.856   0.796   -0.059  0.66    0.5987066689993701      1.30e-06        CTGCCTGATCC     49      7.19e-07        accumulation            -14     TCACC

DepledgeLab commented 1 year ago

Curious. I'm not sure what has happened here. Would you be able to show the full commands used for both and also confirm that you are running both in transcriptome mode? Did you see any errors reported during the analysis?

kwonej0617 commented 1 year ago

Actually, I used the same command line for both samples, and no errors occurred.

python DRUMMER.py -r ${ref} -l ${txt} -t ${ko_bam} -c ${wt_bam} -o sample1/com -m True -a isoform
python DRUMMER.py -r ${ref} -l ${txt} -t ${ko_bam} -c ${wt_bam} -o sample2/com -m True -a isoform

Of these two cases, which one is the correct one?

data/output/DRUMMER/sample1/com/w-k: # no gTEST and MERGED directories generated
bam_readcount  complete_analysis  map  m6A_plot.pdf  summary.txt

data/output/DRUMMER/sample2/com/w-k: # no m6A_plot.pdf generated, no files in MOTIF and ODDS directories
bam_readcount  complete_analysis  gTEST  map  MERGED  MOTIF  ODDS  summary.txt

Also, I have another question. Have you deposited the meRIP-seq data for adenovirus in the supplementary data? Is it the one in the "Ad5 viral m6A peaks" tab of Supplementary Data 2?

Thank you for your help!
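
For anyone comparing two DRUMMER output folders in the same way, the differences listed above (entries present in only one folder, and directories that exist but are empty) can be collected with a short script. This is a minimal sketch, not part of DRUMMER: the helper names `top_level_names`, `empty_dirs` and `compare_outputs` are hypothetical, and the two paths are simply the ones quoted in this thread. It only inspects the file system and makes no assumption about which layout DRUMMER is expected to produce.

```python
from pathlib import Path


def top_level_names(root):
    """Return the names of files and directories directly under root."""
    return {p.name for p in Path(root).iterdir()}


def empty_dirs(root):
    """Return relative paths of directories under root that contain nothing."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_dir() and not any(p.iterdir())
    )


def compare_outputs(dir_a, dir_b):
    """Summarise how two output folders differ (hypothetical helper)."""
    a, b = top_level_names(dir_a), top_level_names(dir_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "empty_dirs_a": empty_dirs(dir_a),
        "empty_dirs_b": empty_dirs(dir_b),
    }


if __name__ == "__main__":
    # Paths as quoted in this thread; adjust to your own layout.
    sample1 = Path("data/output/DRUMMER/sample1/com/w-k")
    sample2 = Path("data/output/DRUMMER/sample2/com/w-k")
    if sample1.is_dir() and sample2.is_dir():
        for key, value in compare_outputs(sample1, sample2).items():
            print(f"{key}: {value}")
    else:
        print("One or both output folders were not found.")
```

Pasting the printed summary into the issue, together with the full commands, makes it easier to see at which step the two runs diverged.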