RitataLU / MethylC-analyzer

GNU General Public License v3.0

A question about running "python MethylC.py command met_list.txt TAR10.genes.gtf ~/wgbs/MethylC-analyzer/scripts/" #5

Open aoliaomiaomiao opened 1 year ago

aoliaomiaomiao commented 1 year ago

Hello, a few days ago I read the paper "MethylC-analyzer: a comprehensive downstream pipeline for the analysis of genome-wide DNA methylation" and I am trying to run the command "python MethylC.py command met_list.txt TAR10.genes.gtf ~/wgbs/MethylC-analyzer/scripts/". It produced the following output:

##########################################################################################
Now processing 1.1.cytosine_methylation_output.txt.CX_report.txt.CGmap.gz
Now processing 1.2.cytosine_methylation_output.txt.CX_report.txt.CGmap.gz
Now processing 1.3.cytosine_methylation_output.txt.CX_report.txt.CGmap.gz
Now processing 2.1.cytosine_methylation_output.txt.CX_report.txt.CGmap.gz
Now processing 2.2.cytosine_methylation_output.txt.CX_report.txt.CGmap.gz
Now processing 2.3.cytosine_methylation_output.txt.CX_report.txt.CGmap.gz
/usr/local/Miniconda3/envs/methylC_analyzer_env/lib/python3.9/site-packages/seaborn/algorithms.py:98: RuntimeWarning: Mean of empty slice
boot_dist.append(f(*sample, **func_kwargs))
/usr/local/Miniconda3/envs/methylC_analyzer_env/lib/python3.9/site-packages/numpy/lib/nanfunctions.py:1384: RuntimeWarning: All-NaN slice encountered
return _nanquantile_unchecked(
posx and posy should be finite values
[message repeated 12 times in total]

| Extract Regions from annotations | | snf Fall 2014 |

Building GTF dictionary...
Processed 25000 lines...
Processed 50000 lines...
Dictionary built.
Writing transcript properties.
Processed 2500 entries...
Processed 5000 entries...
Processed 7500 entries...
Processed 10000 entries...
Processed 12500 entries...
Processed 13348 entries.
rm: cannot remove '/home/wgbs/MethylC-analyzer/scripts/gene_igr_bed6.bed': No such file or directory
######################################################################################

In the directory /home/wgbs/MethylC-analyzer/scripts/, only _IGR_bed6.bed and _IGR_merge.bed exist, not gene_igr_bed6.bed. So what causes the error "rm: cannot remove '/home/wgbs/MethylC-analyzer/scripts/*gene_igr_bed6.bed': No such file or directory"? Happy Dragon Boat Festival!

beritlin commented 1 year ago

Hello aoliaomiaomiao,

The error is probably caused by an inconsistent file name; we will fix this in the next update. However, I believe this problem does not affect the output results. Thanks for your feedback!

Pei-Yu

aoliaomiaomiao commented 1 year ago

But the code stops halfway; it doesn't export all the output files.

treartis commented 11 months ago

I also experienced the same issue. I fixed the remove error by changing the file name in the MethylC.py script (gene_igr_bed6.bed to IGR_bed6.bed), but the code didn't export all output files. Any suggestions?

obbedio commented 11 months ago

Hi! I got the same error and, after trying the solution adopted by @treartis, I got the same result: no output files... Could @beritlin, @RitataLU or someone else help us, please? Thanks a lot in advance :)

treartis commented 11 months ago

I learned that you must replace the [command] placeholder with the type of analysis you want to run; the literal word "command" is not an analysis. For example, if you want to generate a heatmap and PCA, the command is: Heatmap_PCA

For example: python MethylC.py Heatmap_PCA met_list.txt TAR10.genes.gtf ~/wgbs/MethylC-analyzer/scripts/

Other commands are: Fold_Enrichment, Metaplot, DMR, ChrView

This should help although the scripts may need some additional adjustments to fix any errors.
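A small wrapper can catch the placeholder mistake before the pipeline starts. The sketch below is not part of MethylC-analyzer: `build_methylc_args` and `KNOWN_COMMANDS` are hypothetical names, and the command list contains only what is mentioned in this thread (plus `all`, which appears in a MethylC.py excerpt quoted later in the thread), so it may be incomplete.

```python
import sys

# Commands mentioned in this thread; the real tool may accept others.
KNOWN_COMMANDS = ("Heatmap_PCA", "Fold_Enrichment", "Metaplot", "DMR", "ChrView", "all")


def build_methylc_args(command, sample_list, gtf, path_to_files):
    """Return the argument list for MethylC.py, rejecting unknown commands."""
    if command not in KNOWN_COMMANDS:
        raise ValueError(
            f"'{command}' is not a recognised analysis; choose one of: "
            + ", ".join(KNOWN_COMMANDS)
        )
    # Positional order follows the invocation shown in this thread.
    return [sys.executable, "MethylC.py", command, sample_list, gtf, path_to_files]


if __name__ == "__main__":
    args = build_methylc_args("Heatmap_PCA", "met_list.txt", "TAR10.genes.gtf", "scripts/")
    print(" ".join(args[1:]))
```

Passing the literal word `command`, as in the original post, would raise a `ValueError` here instead of letting the run continue with no analysis selected.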

obbedio commented 11 months ago

Dear @treartis, thanks a lot for your suggestions, I will try them tomorrow! In the meantime, could you tell me whether I can produce all plots with a single command? And are the commands you listed above all the available ones? Thanks again :)

obbedio commented 11 months ago

I have tried to execute the command for HeatMap and PCA, but I obtained the following messages, with no output @treartis :

Now processing RP49H.txt.CGmap.gz
Now processing RP49.txt.CGmap.gz
no display found. Using non-interactive Agg backend
/Users/user/miniconda3/envs/methylC_analyzer_env/lib/python3.9/site-packages/seaborn/axisgrid.py:118: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
posx and posy should be finite values
[message repeated 12 times in total]
------------------------
|generating Heatmap $ PCA|
------------------------
Fatal error: cannot open file '/MethylC-analyzer/scripts/heatmap_PCA_all.R': No such file or directory

| Extract Regions from annotations | | snf Fall 2014 |

Building GTF dictionary...
Processed 25000 lines...
[progress messages continue in steps of 25000 lines]
Processed 2925000 lines...
Dictionary built.
Writing transcript properties.
Processed 2500 entries...
[progress messages continue in steps of 2500 entries]
Processed 272500 entries...
Processed 274031 entries.

treartis commented 11 months ago

Good question. I haven't tried running all commands in one line. I ran the full code on my own data but followed the same structure as above.

The error indicates that the script 'heatmap_PCA_all.R' isn't being found. You need to make sure it is located in the appropriate directory. I would specify the full path of the script file in the MethylC.py script anywhere it looks for that file. For example in this section:

    # heatmap_PCA
    if(command=='Heatmap_PCA' or command=='all'):
        print ("------------------------")
        print ("|generating Heatmap $ PCA|")
        print ("------------------------")
        if context == 'CG':
            subprocess.call('''Rscript --slave /MethylC-analyzer/scripts/heatmap_PCA_all.R %s %s %s'''%(path_to_files + "CommonRegion_CG.txt",pca_heat_cut,path_to_files), shell=True)
        elif context == 'CHG':
            subprocess.call('''Rscript --slave /MethylC-analyzer/scripts/heatmap_PCA_all.R %s %s %s'''%(path_to_files +"CommonRegion_CHG.txt",pca_heat_cut,path_to_files), shell=True)
        elif context == 'CHH':
            subprocess.call('''Rscript --slave /MethylC-analyzer/scripts/heatmap_PCA_all.R %s %s %s'''%(path_to_files +"CommonRegion_CHH.txt",pca_heat_cut,path_to_files ), shell=True)
    else:
        pass
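One way to avoid editing every hard-coded path is to build the R script path from a directory you pass in and to fail early if the file is missing. This is only a sketch, not the project's code: `rscript_call` is a hypothetical helper, and it assumes you know which folder holds heatmap_PCA_all.R (for example the folder containing MethylC.py).

```python
import os


def rscript_call(script_dir, script_name, *script_args):
    """Build an Rscript argument list, failing early if the script is missing."""
    script_path = os.path.join(os.path.abspath(script_dir), script_name)
    if not os.path.isfile(script_path):
        raise FileNotFoundError(f"R script not found: {script_path}")
    # A list (instead of one shell string) keeps paths with spaces intact.
    return ["Rscript", "--slave", script_path, *map(str, script_args)]
```

The returned list could be handed to `subprocess.call(cmd)` without `shell=True`, for example `rscript_call(scripts_dir, "heatmap_PCA_all.R", path_to_files + "CommonRegion_CG.txt", pca_heat_cut, path_to_files)`, where `scripts_dir` is a hypothetical variable holding the scripts folder.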

obbedio commented 11 months ago

Dear @treartis, thanks again. I am now trying a new analysis after removing the full paths from all subprocess calls (I am running the script from within the /MethylC-analyzer/scripts/ folder, which contains all my data). I hope this works; otherwise I will retry your solution. I will keep you updated :)

obbedio commented 11 months ago

No results... I replaced the full path of all script files in MethylC.py, this time obtaining the following:

(methylC_analyzer_env) luigidonato@ scripts % python MethylC.py all samples_list.txt hg38knownGene.gtf /Users/luigidonato/MethylC-analyzer/scripts/
Now processing RP49H.txt.CGmap.gz
Now processing RP49.txt.CGmap.gz
no display found. Using non-interactive Agg backend
/Users/luigidonato/miniconda3/envs/methylC_analyzer_env/lib/python3.9/site-packages/seaborn/axisgrid.py:118: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
posx and posy should be finite values
[message repeated 12 times in total]
------------------------
|generating Heatmap $ PCA|
------------------------
Loading required package: viridisLite

Attaching package: ‘gplots’

The following object is masked from ‘package:stats’:

lowess

Loading required package: grid

ComplexHeatmap version 2.16.0
Bioconductor page: http://bioconductor.org/packages/ComplexHeatmap/
Github page: https://github.com/jokergoo/ComplexHeatmap
Documentation: http://jokergoo.github.io/ComplexHeatmap-reference

If you use it in published research, please cite either one:

The new InteractiveComplexHeatmap package can directly export static complex heatmaps into an interactive Shiny app with zero effort. Have a try!

This message can be suppressed by: suppressPackageStartupMessages(library(ComplexHeatmap))

[1] 100025 5
Error in hclust(get_dist(submat, distance), method = method) : size cannot be NA nor exceed 65536
Calls: draw ... make_row_cluster -> .local -> make_cluster -> hclust
Execution halted
---------------
|Identifying DMR|
---------------
Traceback (most recent call last):
  File "/Users/luigidonato/MethylC-analyzer/scripts/MethylC.py", line 508, in <module>
    Find_DMR2(context,dmr_cut,testmethod)
  File "/Users/luigidonato/MethylC-analyzer/scripts/MethylC.py", line 163, in Find_DMR2
    pKS = stats.kstest(expValue2,ctrlValue2)[1]
  File "/Users/luigidonato/miniconda3/envs/methylC_analyzer_env/lib/python3.9/site-packages/scipy/_lib/_util.py", line 713, in wrapper
    return fun(*args, **kwargs)
  File "/Users/luigidonato/miniconda3/envs/methylC_analyzer_env/lib/python3.9/site-packages/scipy/stats/_stats_py.py", line 9081, in kstest
    return ks_2samp(xvals, yvals, alternative=alternative, method=method)
  File "/Users/luigidonato/miniconda3/envs/methylC_analyzer_env/lib/python3.9/site-packages/scipy/_lib/_util.py", line 713, in wrapper
    return fun(*args, **kwargs)
  File "/Users/luigidonato/miniconda3/envs/methylC_analyzer_env/lib/python3.9/site-packages/scipy/stats/_stats_py.py", line 8809, in ks_2samp
    raise ValueError('Data passed to ks_2samp must not be empty')
ValueError: Data passed to ks_2samp must not be empty
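Both failures in this log point at the data rather than the paths: `hclust` refuses more than 65536 rows (the printed `[1] 100025 5` looks like the matrix dimensions, though that is an assumption), and `ks_2samp` received an empty group. The sketch below shows two generic guards in plain Python. `most_variable_rows` and `can_run_ks` are hypothetical helpers, not MethylC-analyzer functions, and keeping the most variable rows is just one possible way to shrink the matrix.

```python
import math
import statistics

HCLUST_MAX_ROWS = 65536  # limit quoted in the R error message


def most_variable_rows(rows, max_rows=HCLUST_MAX_ROWS):
    """Keep at most max_rows rows, preferring those with the highest variance.

    Assumes every row is a non-empty list of numbers without missing values.
    """
    if len(rows) <= max_rows:
        return list(rows)
    ranked = sorted(rows, key=statistics.pvariance, reverse=True)
    return ranked[:max_rows]


def finite_values(values):
    """Drop None and NaN entries."""
    return [v for v in values if v is not None and not math.isnan(v)]


def can_run_ks(exp_values, ctrl_values):
    """True only if both groups still hold data after cleaning."""
    return bool(finite_values(exp_values)) and bool(finite_values(ctrl_values))
```

In MethylC.py, a check like `can_run_ks` could sit just before the `stats.kstest(expValue2,ctrlValue2)` call shown in the traceback, so that regions with an empty group are skipped instead of aborting the run.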

This analysis seems to be very hard... @treartis

obbedio commented 11 months ago

Good morning, @treartis @RitataLU @aoliaomiaomiao, may I send you my input files to test with MethylC? I have been trying this analysis for about a month, but I keep getting one error after another... Please help me, I have no other way to test the script...