jlab-code / MethylStar

A fast and robust pre-processing pipeline for bulk or single-cell whole-genome bisulfite sequencing (WGBS) data.
GNU General Public License v3.0

Running last step - Methimpute troubleshooting #2

Closed El-Castor closed 4 years ago

El-Castor commented 4 years ago

Hi @shahryary,

Thanks for your last response; sorry I didn't reply before the issue was closed.

I have continued the analysis using your pipeline. When I launch Methimpute, I get an error immediately after the launch. Here is the error message from the shell console:

---------------------------------------------------------------------------

    *** Running Methimpute Part ***

Configuration Summary:

- Intermediate: Enabled
- Fit reports: Enabled
- Enrichment reports: Enabled
- Full reports: Enabled

================================================================================

Running Methimpute Part...
Found reference chromosome file.

Error in contrib.url(repos, type) : 
  trying to use CRAN without setting a mirror
Calls: source ... eval -> req_pkg -> install.packages -> grep -> contrib.url
Execution halted
sort: cannot read: /NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out/methimpute-out/file-processed.lst: No such file or directory
rm: cannot remove '/NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out/methimpute-out/file-processed.lst': No such file or directory
(535, '5.7.8 Username and Password not accepted. Learn more at\n5.7.8  https://support.google.com/mail/?p=BadCredentials o18sm8299173wme.19 - gsmtp')
Something went wrong...
something is going wrong... please run again. 

Please, press ENTER to continue ...

Do you have any idea what is going wrong?

Thanks in advance,

shahryary commented 4 years ago

Hi @El-Castor,

Thanks for the feedback. This error suggests that the pipeline cannot install R packages on your system. Could you please install the required R libraries manually and try again? For example, these packages: list.of.packages = c("DMRcaller","GenomicRanges","devtools","annotatr","GenomicFeatures","data.table","dplyr","ggplot2","doParallel","stringr","Rhtslib","methylKit")

El-Castor commented 4 years ago

Hi @shahryary, the trouble is that install.packages() has no CRAN mirror set, so it cannot find where to download the libraries from. To work around this without modifying the code, I installed all the libraries by hand.
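
For reference, the "trying to use CRAN without setting a mirror" error can usually be avoided by giving R a default repository in a profile file, which Rscript reads at startup unless it is launched with an option that skips init files. The following is a minimal sketch, not part of MethylStar: the function name `set_cran_mirror` is hypothetical, and the mirror URL is only one possible choice.

```shell
# Sketch: give R a default CRAN mirror so that non-interactive
# install.packages() calls know where to download from.
# Assumptions: set_cran_mirror is a hypothetical helper, and the mirror
# URL and profile location are illustrative choices.
set_cran_mirror() {
    # $1: path to an R profile file (typically "$HOME/.Rprofile")
    # $2: mirror URL (defaults to the CRAN cloud redirector)
    profile=$1
    mirror=${2:-https://cloud.r-project.org}
    # Append only once, so repeated calls do not duplicate the setting.
    if ! grep -q 'options(repos' "$profile" 2>/dev/null; then
        printf 'options(repos = c(CRAN = "%s"))\n' "$mirror" >> "$profile"
    fi
}
```

A typical call would be `set_cran_mirror "$HOME/.Rprofile"` before launching the pipeline. Installing the packages by hand, as done here, is an equally valid workaround.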

But now I have another issue: I have some conflicts with Rhtslib and methylKit, as you can see below:

installing *source* package ‘Rhtslib’ ...
** libs
cd "htslib-1.7" && make -f "/opt/share/FLOCAD/userspace/cpichot/miniconda3/envs/MethylStar/lib/R/etc/Makeconf" -f "Makefile.Rhtslib"
make[1]: Entering directory '/tmp/RtmpRZjeZN/R.INSTALL8ca35e4757d8/Rhtslib/src/htslib-1.7'
Makefile.Rhtslib:128: warning: overriding recipe for target '.c.o'
/opt/share/FLOCAD/userspace/cpichot/miniconda3/envs/MethylStar/lib/R/etc/Makeconf:160: warning: ignoring old recipe for target '.c.o'
x86_64-conda_cos6-linux-gnu-cc -g -Wall -O2 -I.  -c -o kfunc.o kfunc.c
x86_64-conda_cos6-linux-gnu-cc -g -Wall -O2 -I.  -c -o knetfile.o knetfile.c
x86_64-conda_cos6-linux-gnu-cc -g -Wall -O2 -I.  -c -o kstring.o kstring.c
x86_64-conda_cos6-linux-gnu-cc -g -Wall -O2 -I.  -c -o bcf_sr_sort.o bcf_sr_sort.c
x86_64-conda_cos6-linux-gnu-cc -g -Wall -O2 -I.  -c -o bgzf.o bgzf.c
In file included from bgzf.c:39:0:
htslib/bgzf.h:35:10: fatal error: zlib.h: No such file or directory
 #include <zlib.h>
          ^~~~~~~~
compilation terminated.
make[1]: *** [Makefile.Rhtslib:128: bgzf.o] Error 1
make[1]: Leaving directory '/tmp/RtmpRZjeZN/R.INSTALL8ca35e4757d8/Rhtslib/src/htslib-1.7'
make: *** [Makevars.common:23: htslib] Error 2
ERROR: compilation failed for package ‘Rhtslib’
* removing ‘/opt/share/FLOCAD/userspace/cpichot/miniconda3/envs/MethylStar/lib/R/library/Rhtslib’

The downloaded source packages are in
    ‘/tmp/RtmpaeYfux/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Warning message:
In install.packages(...) :
  installation of package ‘Rhtslib’ had non-zero exit status

To work around this, I used conda to install this package from the bioconda channel. I don't know yet whether this resolves the conflicts. I will give you more information later.
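
The log above shows the conda compiler (`x86_64-conda_cos6-linux-gnu-cc`) failing on `#include <zlib.h>`. A likely explanation, stated here as an assumption, is that this compiler looks for headers inside the conda environment rather than in the system include directories, so a system-wide zlib is not enough. A small sketch to check whether the header is visible under a given prefix (the helper name `has_zlib_header` is hypothetical):

```shell
# Sketch: report whether zlib.h is present under a given prefix.
# Assumption: the compiler used by R searches "$prefix/include".
has_zlib_header() {
    # $1: installation prefix, e.g. the active conda environment
    prefix=$1
    if [ -f "$prefix/include/zlib.h" ]; then
        echo "found: $prefix/include/zlib.h"
        return 0
    fi
    echo "missing: $prefix/include/zlib.h"
    return 1
}
```

With an activated environment, `has_zlib_header "$CONDA_PREFIX"` reports the result. If the header is missing, installing the `zlib` package into the same environment is the usual remedy, although that has not been verified for this particular setup.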

El-Castor commented 4 years ago

Hi @shahryary,

I have resolved the conflict related to the methimpute install. When I run methimpute I get an error, but I don't understand what is wrong. See the shell error message below:

[1] "It's first time you are running Methimpute for this data-set!"
Scanning for ambiguous nucleotides ... 34.38s
Extracting cytosines from forward strand ... 62.71s
Extracting cytosines from reverse strand ... 74.49s
Merging ... 14.15s
Shifting by anchor ... 24.88s
Sorting ... 19.6s
[1] "Running...../cx-reports/Mock_FDLM202341331-1a_H3V2NDSXY_L4.CX_report.txt"
Reading file ../cx-reports/Mock_FDLM202341331-1a_H3V2NDSXY_L4.CX_report.txt ... 92.67s
Inflating methylome ... 12.5s
Adding distance ... 6.55s
Adding transition context ... 40.14s
Calculating correlations
  for distance 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
Finished calculating correlations in 66.69s
Adding distance ... 6.26s
Adding transition context ... 39.03s
Baum-Welch: Fitting HMM parameters
 Iteration              log(P)             dlog(P)    Time in sec
         0                -inf                   -              0
         1            0.000000                 inf             54
         2           -0.000000           -0.000000            141
HMM: Convergence reached!
Time spent in Baum-Welch: 152.43s
Compiling results ... 97.99s
[1] "Decreasing data-set size..."
./src/bash/methimpute.sh: line 9: 14266 Killed      Rscript ./src/bash/methimpute.R $result_pipeline $genome_ref $genome_name $tmp_rdata $intermediate $fit_output $enrichment_plot $full_report $context_report $intermediate_mode --no-save --no-restore --verbose
sort: cannot read: /NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out/methimpute-out/file-processed.lst: No such file or directory
./src/bash/methimpute.sh: line 12: [: too many arguments
(535, '5.7.8 Username and Password not accepted. Learn more at\n5.7.8  https://support.google.com/mail/?p=BadCredentials u13sm23640596wrp.53 - gsmtp')
Something went wrong...
something is going wrong... please run again. 
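
A note on the `[:` message reported for line 12 of methimpute.sh (bash's "too many arguments" error for the `[` test): it generally appears when an unquoted expansion inside `[ ... ]` splits into several words, which can happen as a side effect once the Rscript process has died and the expected list file is missing. What line 12 actually does is not shown in this thread, so the following is only a sketch of a defensive pattern, with hypothetical function names:

```shell
# Sketch: count entries in a list file without tripping `[`.
# Assumption: the helpers and the check are illustrative, not the
# actual contents of methimpute.sh.
count_entries() {
    # $1: path to a list file; prints 0 when the file does not exist
    if [ -f "$1" ]; then
        sort -u "$1" | wc -l | tr -d ' '
    else
        echo 0
    fi
}

has_entries() {
    # Quoting the expansion keeps it a single word inside [ ... ].
    [ "$(count_entries "$1")" -gt 0 ]
}
```

The secondary messages (`sort`, `[`, and the mail error) are then consequences of the earlier failure rather than separate problems.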
shahryary commented 4 years ago

Hi @El-Castor, I think the problem comes from the cx_reports files. Could you please share just one file, and also "Ref_Chr.RData" from the result folder? (You can share them via Dropbox or Google Drive.)

Thanks.

El-Castor commented 4 years ago

Hi @shahryary ,

Thanks for responding so quickly. The cx_report files are very big (4.5 GB). Is it OK if I send a screenshot of the head and the tail of the file, or do you need the entire file?

shahryary commented 4 years ago

@El-Castor, sending just 50 rows from the head and tail of the file is fine. Please also send a few records of the .txt file in the rdata folder.
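
A minimal sketch of how such an excerpt could be produced from a multi-gigabyte report without opening it in an editor; the function name `sample_head_tail` and the output file name are hypothetical:

```shell
# Sketch: write the first and last N lines of a large report to a small file.
sample_head_tail() {
    # $1: input file, $2: output file, $3: number of lines (default 50)
    n=${3:-50}
    {
        head -n "$n" "$1"
        tail -n "$n" "$1"
    } > "$2"
}
```

For example, `sample_head_tail Mock_FDLM202341331-1a_H3V53DSXY_L2.CX_report.txt cx_excerpt.txt` would give a 100-line file that is small enough to attach.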

El-Castor commented 4 years ago

@shahryary I sent you Ref_Chr.RData via Dropbox. Tell me if you did not receive it.

Concerning the cx_reports output file, here are the first 50 rows:

(MethylStar) cpichot@node15:/NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out/cx-reports$ head -50 Mock_FDLM202341331-1a_H3V53DSXY_L2.CX_report.txt
CMiso1.1chr03   1   +   0   0   CHH CCC
CMiso1.1chr03   2   +   0   0   CHH CCA
CMiso1.1chr03   3   +   0   0   CHH CAA
CMiso1.1chr03   7   +   0   0   CHH CCC
CMiso1.1chr03   8   +   0   0   CHH CCT
CMiso1.1chr03   9   +   0   0   CHH CTA
CMiso1.1chr03   14  +   0   0   CHH CCC
CMiso1.1chr03   15  +   0   0   CHG CCG
CMiso1.1chr03   16  +   0   0   CG  CGA
CMiso1.1chr03   17  -   0   0   CG  CGG
CMiso1.1chr03   21  +   0   0   CHH CCC
CMiso1.1chr03   22  +   0   0   CHH CCT
CMiso1.1chr03   23  +   0   0   CHH CTA
CMiso1.1chr03   28  +   0   0   CHH CCC
CMiso1.1chr03   29  +   0   0   CHG CCG
CMiso1.1chr03   30  +   0   0   CG  CGA
CMiso1.1chr03   31  -   0   0   CG  CGG
CMiso1.1chr03   35  +   0   0   CHH CCC
CMiso1.1chr03   36  +   0   0   CHH CCT
CMiso1.1chr03   37  +   0   0   CHH CTA
CMiso1.1chr03   42  +   0   0   CHH CCC
CMiso1.1chr03   43  +   0   0   CHH CCC
CMiso1.1chr03   44  +   0   0   CHH CCA
CMiso1.1chr03   45  +   0   0   CHH CAA
CMiso1.1chr03   49  +   0   0   CHH CCC
CMiso1.1chr03   50  +   0   0   CHH CCT
CMiso1.1chr03   51  +   0   0   CHH CTA
CMiso1.1chr03   56  +   0   0   CHH CCC
CMiso1.1chr03   57  +   0   0   CHH CCC
CMiso1.1chr03   58  +   0   0   CHH CCA
CMiso1.1chr03   59  +   0   0   CHH CAA
CMiso1.1chr03   63  +   0   0   CHH CCC
CMiso1.1chr03   64  +   0   0   CHH CCT
CMiso1.1chr03   65  +   0   0   CHH CTA
CMiso1.1chr03   70  +   0   0   CG  CGC
CMiso1.1chr03   71  -   0   0   CG  CGT
CMiso1.1chr03   72  +   0   0   CHH CTA
CMiso1.1chr03   77  +   0   1   CHH CCC
CMiso1.1chr03   78  +   1   0   CHG CCG
CMiso1.1chr03   79  +   1   0   CG  CGA
CMiso1.1chr03   80  -   0   0   CG  CGG
CMiso1.1chr03   83  +   0   1   CHH CCC
CMiso1.1chr03   84  +   0   1   CHH CCC
CMiso1.1chr03   85  +   1   0   CHG CCG
CMiso1.1chr03   86  +   1   0   CG  CGA
CMiso1.1chr03   87  -   0   0   CG  CGG
CMiso1.1chr03   90  +   0   1   CHH CCC
CMiso1.1chr03   91  +   0   1   CHH CCC
CMiso1.1chr03   92  +   1   0   CHG CCG
CMiso1.1chr03   93  +   1   0   CG  CGA
shahryary commented 4 years ago

@El-Castor Your cx-report and ref_chr.txt files look fine. The problem is that the "methimpute" package can't process the file to fit the HMM. If you could send just one sample, I can process your file to find the problem.

El-Castor commented 4 years ago

When you say sample, do you mean the BAM file?
Keep in mind that I am not using the A. thaliana or human genome; I use a custom genome. I don't know if you need it for this part, but it is pending publication, so I don't have the right to send it. I hope you will understand.

Moreover, I have checked the R libraries that I have to install. I see that I cannot install "Rhtslib" because of the zlib library. It is installed, which is why I don't understand the problem.

See the error below:

(MethylStar) cpichot@node15:/NetScratch/cpichot/WGBS_analysis/library$ wget https://bioconductor.org/packages/release/bioc/src/contrib/Rhtslib_1.20.0.tar.gz
Will not apply HSTS. The HSTS database must be a regular and non-world-writable file.
ERROR: could not open HSTS store at '/ips2/users/cpichot/cluster/.wget-hsts'. HSTS will be disabled.
--2020-06-15 18:35:58--  https://bioconductor.org/packages/release/bioc/src/contrib/Rhtslib_1.20.0.tar.gz
Resolving bioconductor.org (bioconductor.org)… 13.225.25.77, 13.225.25.109, 13.225.25.65, ...
Connecting to bioconductor.org (bioconductor.org)|13.225.25.77|:443… connected.
HTTP request sent, awaiting response… 200 OK
Length: 1471582 (1.4M) [application/x-gzip]
Saving to: 'Rhtslib_1.20.0.tar.gz'

Rhtslib_1.20.0.tar.gz      100%[========================================>]   1.40M  3.02MB/s    in 0.5s

2020-06-15 18:35:59 (3.02 MB/s) - 'Rhtslib_1.20.0.tar.gz' saved [1471582/1471582]

(MethylStar) cpichot@node15:/NetScratch/cpichot/WGBS_analysis/library$ ls
Rhtslib_1.20.0.tar.gz
(MethylStar) cpichot@node15:/NetScratch/cpichot/WGBS_analysis/library$ tar -czvf Rhtslib_1.20.0.tar.gz ./Rhtslib_1.20.0
tar: ./Rhtslib_1.20.0: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
(MethylStar) cpichot@node15:/NetScratch/cpichot/WGBS_analysis/library$
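
A note on the tar command above: `-c` creates an archive, whereas `-x` extracts one. `tar -czvf Rhtslib_1.20.0.tar.gz ./Rhtslib_1.20.0` therefore tries to build a new archive from a directory that does not exist yet, and it has probably overwritten the downloaded tarball in the process, so a fresh download may be needed. A minimal extraction sketch (the helper name `extract_tarball` is hypothetical):

```shell
# Sketch: unpack a .tar.gz source archive into a target directory.
extract_tarball() {
    # $1: archive path, $2: destination directory (created if missing)
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}
```

For instance, `extract_tarball Rhtslib_1.20.0.tar.gz .` unpacks into the current directory. Unpacking is optional for installation, since `R CMD INSTALL` also accepts the `.tar.gz` file directly.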
El-Castor commented 4 years ago

Hi,

I have more information: when I launch Methimpute, I see that it doesn't find the cx_report, which is a little strange:


Configuration Summary:

- Intermediate: Enabled
- Fit reports: Enabled
- Enrichment reports: Enabled
- Full reports: Enabled
Couldn't find any CX file starting to run the Methimpute .. 
================================================================================

Running Methimpute Part...
Found reference chromosome file.

Loading required package: devtools
Loading required package: usethis
Loading required package: data.table
Loading required package: dplyr

Attaching package: ‘dplyr’

The following objects are masked from ‘package:data.table’:

    between, first, last

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Loading required package: ggplot2
Loading required package: doParallel
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
Loading required package: stringr
Loading required package: methimpute
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:dplyr’:

    combine, intersect, setdiff, union

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colMeans,
    colnames, colSums, dirname, do.call, duplicated, eval, evalq,
    Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply,
    lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int,
    pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans, rownames,
    rowSums, sapply, setdiff, sort, table, tapply, union, unique,
    unsplit, which, which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:dplyr’:

    first, rename

The following objects are masked from ‘package:data.table’:

    first, second

The following object is masked from ‘package:base’:

    expand.grid

Loading required package: IRanges

Attaching package: ‘IRanges’

The following objects are masked from ‘package:dplyr’:

    collapse, desc, slice

The following object is masked from ‘package:data.table’:

    shift

Loading required package: GenomeInfoDb
character(0)
[1] "It's first time you are running Methimpute for this data-set!"
Scanning for ambiguous nucleotides ...
El-Castor commented 4 years ago

I retried methimpute after reinstalling my R environment.

This time I get different error output, as you can see below:

Loading required package: GenomeInfoDb
character(0)
[1] "It's first time you are running Methimpute for this data-set!"
Scanning for ambiguous nucleotides ... 33.89s
Extracting cytosines from forward strand ... 63.04s
Extracting cytosines from reverse strand ... 72.96s
Merging ... 13.92s
Shifting by anchor ... 24.86s
Sorting ... 18.87s
[1] "Running...../cx-reports/Mock_FDLM202341331-1a_H3V2NDSXY_L4.CX_report.txt"
Reading file ../cx-reports/Mock_FDLM202341331-1a_H3V2NDSXY_L4.CX_report.txt ... 91.74s
Inflating methylome ... 12.16s
Adding distance ... 6.26s
Adding transition context ... 41.29s
Calculating correlations
  for distance 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
Finished calculating correlations in 66.51s
Adding distance ... 6.51s
Adding transition context ... 41.97s
Baum-Welch: Fitting HMM parameters
 Iteration              log(P)             dlog(P)    Time in sec
         0                -inf                   -              0
         1            0.000000                 inf             55
HMM: Error in Baum-Welch: nan detected
Time spent in Baum-Welch: 99.28s
Compiling results ... 0s
ERROR : undefined columns selected 
[1] "Running...../cx-reports/Mock_FDLM202341331-1a_H3V53DSXY_L2.CX_report.txt"
Reading file ../cx-reports/Mock_FDLM202341331-1a_H3V53DSXY_L2.CX_report.txt ... 97.29s
Inflating methylome ... 15.58s
Adding distance ... 8.16s
Adding transition context ... 44.22s
Calculating correlations
  for distance 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
Finished calculating correlations in 76.11s
Adding distance ... 7.47s
Adding transition context ... 42.02s
Baum-Welch: Fitting HMM parameters
 Iteration              log(P)             dlog(P)    Time in sec
         0                -inf                   -              0
         1            0.000000                 inf             55
HMM: Error in Baum-Welch: nan detected
Time spent in Baum-Welch: 101.21s
Compiling results ... 0s
ERROR : undefined columns selected 
[1] "Running...../cx-reports/R150mM_FDLM202341332-1a_H3V2NDSXY_L4.CX_report.txt"
Reading file ../cx-reports/R150mM_FDLM202341332-1a_H3V2NDSXY_L4.CX_report.txt ... 113.45s
Inflating methylome ... 12.25s
Adding distance ... 6.25s
Adding transition context ... 41.11s
Calculating correlations
  for distance 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
Finished calculating correlations in 66.81s
Adding distance ... 6.25s
Adding transition context ... 41.13s
Baum-Welch: Fitting HMM parameters
 Iteration              log(P)             dlog(P)    Time in sec
         0                -inf                   -              0
         1            0.000000                 inf             57
HMM: Error in Baum-Welch: nan detected
Time spent in Baum-Welch: 101.53s
Compiling results ... 0s
ERROR : undefined columns selected 
[1] "Running...../cx-reports/R150mM_FDLM202341332-1a_H3V53DSXY_L2.CX_report.txt"
Reading file ../cx-reports/R150mM_FDLM202341332-1a_H3V53DSXY_L2.CX_report.txt ...
./src/bash/methimpute.sh: line 9: 19714 Killed      Rscript ./src/bash/methimpute.R $result_pipeline $genome_ref $genome_name $tmp_rdata $intermediate $fit_output $enrichment_plot $full_report $context_report $intermediate_mode --no-save --no-restore --verbose
sort: cannot read: /NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out/methimpute-out/file-processed.lst: No such file or directory
./src/bash/methimpute.sh: line 12: [: too many arguments
(535, '5.7.8 Username and Password not accepted. Learn more at\n5.7.8  https://support.google.com/mail/?p=BadCredentials z6sm25076199wrh.79 - gsmtp')
Something went wrong...
something is going wrong... please run again. 

Please, press ENTER to continue ...
shahryary commented 4 years ago

Hi @El-Castor ,

Thanks for the feedback; not being able to share the cx-report is completely understandable. To debug this problem I shared an R file (the methimpute script) with you. Please modify lines 15-20 and load the libraries. Then run it line by line and check the result after each line. From my perspective the problem arises when the script starts reading the cx-file, so please run it and let me know at which step you get the error.

El-Castor commented 4 years ago

Hi @shahryary ,

Thanks a lot for the custom methimpute script for the test. I will try it today and get back to you. Thank you very much!

Is it possible to see the bash command that launches the Rscript, so that the arguments passed to it appear in a log file or something similar? I have some difficulty finding the values of the fifth through the last arguments that you give to methimpute, as you can see below:

wd=(args[1])          # result directory wd=("/NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out")
setwd(paste0(wd,"/","methimpute-out"))
genome_ref=(args[2])  # genome reference directory genome_ref=("/NetScratch/cpichot/references")
name_genome=(args[3]) # name of genome name_genome=("Others")
rdata=(args[4])       # file for genes/TEs/etc.(annotation files) rdata=("/NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out/rdata")
intermediate<-as.logical(toupper((args[5])))
fit_output=as.logical(toupper((args[6])))
enrichment_plot=as.logical(toupper((args[7])))
full_report=as.logical(toupper((args[8])))
context_report=(args[9])
intermediate_mode=(args[10])
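
One way to capture the arguments, sketched here as an assumption rather than as something MethylStar provides, is a small wrapper that appends the full command line to a log file before running it. The name `log_and_run` and the log file name are hypothetical:

```shell
# Sketch: record the exact command line in a log file, then run it.
log_and_run() {
    # $1: log file; remaining arguments: the command and its arguments
    log=$1
    shift
    # One line per invocation: timestamp followed by the full command.
    printf '%s  %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$log"
    "$@"
}
```

Prefixing the Rscript call in the launch script with `log_and_run methimpute-args.log` would record the values after variable expansion. Running the script with `bash -x` is a lighter alternative, since it prints each command with its expanded arguments as it executes.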
El-Castor commented 4 years ago

Hi @shahryary ,

So I ran each line of your code and found where the issue is. See the following messages in the R console:

First of all, I get two warnings, but I don't know whether they have much influence on the rest of the analysis:

line 104:
> methylome <- inflateMethylome(bismark.data, cytosine.positions)
Inflating methylome ... 10.43s
Warning message:
In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)
line 106: 
> fit <- estimateTransDist(distcor)
Warning message:
In max(df$correlation, na.rm = TRUE) :
  no non-missing arguments to max; returning -Inf
line 108:
> if (intermediate==TRUE){
+   model <- callMethylation(data = methylome, transDist = fit$transDist, include.intermediate=intermediate , update=intermediate_mode)
+   }else{
+   model <- callMethylation(data = methylome, transDist = fit$transDist, include.intermediate=intermediate)    
+ }
Adding distance ... 4.64s
Adding transition context ... 28.87s
Baum-Welch: Fitting HMM parameters
 Iteration              log(P)             dlog(P)    Time in sec
         0                -inf                   -              0
         1            0.000000                 inf             36
HMM: Error in Baum-Welch: nan detected
Time spent in Baum-Welch: 68.57s
Compiling results ... 0s
Warning message:
In callMethylation(data = methylome, transDist = fit$transDist,  :
  Baum-Welch aborted: nan detected

Second, here is the error message I get when exporting the called methylome:

line 114:
> modifiedexportMethylome(model, filename = paste0("methylome_", name, ".txt"),going_file)
Error in `[.data.frame`(df, , c("seqnames", "start", "strand", "context",  : 
  undefined columns selected

Here is the head of the model created by methimpute:

head(model)
$data
GRanges object with 121817216 ranges and 4 metadata columns:
                                seqnames    ranges strand |  context   counts
                                   <Rle> <IRanges>  <Rle> | <factor> <matrix>
          [1]  CMiso1.1chr00 len=3219671         3      + |      CHH      0:0
          [2]  CMiso1.1chr00 len=3219671         6      - |      CHH      0:0
          [3]  CMiso1.1chr00 len=3219671        10      - |      CHH      0:0
          [4]  CMiso1.1chr00 len=3219671        17      - |      CHH      0:0
          [5]  CMiso1.1chr00 len=3219671        18      - |      CHH      0:0
          ...                        ...       ...    ... .      ...      ...
  [121817212] CMiso1.1chr12 len=26620111  26620101      - |      CHH      0:0
  [121817213] CMiso1.1chr12 len=26620111  26620105      + |      CG       0:0
  [121817214] CMiso1.1chr12 len=26620111  26620106      - |      CG       0:0
  [121817215] CMiso1.1chr12 len=26620111  26620107      - |      CHG      0:0
  [121817216] CMiso1.1chr12 len=26620111  26620108      - |      CHH      0:0
               distance transitionContext
              <numeric>          <factor>
          [1]       Inf           NA     
          [2]         2           CHH-CHH
          [3]         3           CHH-CHH
          [4]         6           CHH-CHH
          [5]         0           CHH-CHH
          ...       ...               ...
  [121817212]         0           CHG-CHH
  [121817213]         3           CHH-CG 
  [121817214]         0           CG-CG  
  [121817215]         0           CG-CHG 
  [121817216]         0           CHG-CHH
  -------
  seqinfo: 13 sequences from an unspecified genome

Do you have any idea? Thanks!

shahryary commented 4 years ago

Hi @El-Castor ,

Thanks for running it. I didn't develop the methimpute package, but I think the problem starts in the "importBismark" function (one line above the "inflateMethylome" call), when it starts reading the chromosomes and the cx-file. Please check 'bismark.data' and make sure it contains no null values. My second suggestion (I'm not sure about it): change your chromosome naming pattern from "CMiso1.1chr00" to "ch00", etc., in the cx-report and Ref_chr.

El-Castor commented 4 years ago

Hi, thanks for your response. I don't think it is due to the chromosome naming pattern, because when I check both the bismark.data and cytosine.positions R objects, they use the same chromosome names...

I asked the author of METHimpute 6 days ago but have had no response.

Moreover, I checked the bismark-meth-extractor log and saw some errors, as you can see below:

Finished generating genome-wide cytosine report

samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1

I think I have some trouble with how the cx_report was built, because the error from the importBismark() function is due to a problem with the seqnames of the two R objects. I'm sure it is not due to the chromosome names, but when I check the number of rows of the two objects they do not have the same size, and this could explain the error. But I don't understand why bismark does not extract correctly...

Do you have any suggestions, please?
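
A quick way to compare the sequence names outside R is to diff the first column of the CX report against the header lines of the reference FASTA. The sketch below is an illustration under stated assumptions: the helper name `diff_seqnames` is hypothetical, the CX report is taken to be tab-separated with the chromosome in column 1, and FASTA headers are compared verbatim (everything after `>`), so that stray whitespace shows up as a difference.

```shell
# Sketch: list sequence names that appear in only one of the two files.
# Assumptions: the CX report is tab-separated with the chromosome in
# column 1, and the FASTA header line (minus ">") is taken verbatim.
diff_seqnames() {
    # $1: CX report, $2: reference FASTA
    a=$(mktemp) && b=$(mktemp) || return 1
    cut -f1 "$1" | LC_ALL=C sort -u > "$a"
    grep '^>' "$2" | sed 's/^>//' | LC_ALL=C sort -u > "$b"
    # comm -3 hides names common to both files; names only in the FASTA
    # are printed with a leading tab. The sed step appends "$" to every
    # line so that trailing blanks become visible.
    LC_ALL=C comm -3 "$a" "$b" | sed 's/$/$/'
    rm -f "$a" "$b"
}
```

An empty result means both files use exactly the same names. Any line printed points at a name that the two files disagree on, including differences that are invisible in a normal listing.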

shahryary commented 4 years ago

Hi @El-Castor, thanks for your reply. I shared another R script with you in Dropbox (testV2.R). In this code I tried to customize a methimpute function ("importMethylpy") to read your file. I hope this helps you solve the problem; please run the code line by line and let me know the result.

El-Castor commented 4 years ago

Hi @shahryary ,

I will test your code this afternoon; thank you for your quick response. Also, are you sure that the samtools error in the bismark extractor log is not affecting the METHimpute part? We cannot load the cx_report, and the log for the step that produces this file shows the samtools error, as you saw in my last post.

Thanks

El-Castor commented 4 years ago

Hi @shahryary ,

I have tested your new code with the importMethylpy() R function. I get this error message:

> bismark.data <- importMethylpy(going_file, chrom.lengths = Ref_Chr)

 *** caught segfault ***
address 0x2c, cause 'memory not mapped'

Traceback:
 1: fread(file, skip = skip, sep = "\t", colClasses = classes)
 2: importMethylpy(going_file, chrom.lengths = Ref_Chr)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 

For information, I have 60K of RAM and 4 cores.

El-Castor commented 4 years ago

Hi @shahryary ,

I have taken a look, and it seems there are some samtools errors at the end of the bismark extractor log. Moreover, I compared the seqlevels of bismark.data and cytosine.positions, and they do not have the same length: cytosine.positions has 54 fewer entries than bismark.data.

I have asked the author of methimpute, but I'm still waiting.

Are you sure I don't have an issue in the extraction step, given this log file:


Now processing chromosomes that were not covered by any methylation calls in the coverage file...
All chromosomes in the genome were covered by at least some reads. coverage2cytosine processing complete.

Finished generating genome-wide cytosine report

samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1

Moreover, I saw another possible error in the bismark extractor log:

Summarising Bismark methylation extractor parameters:
===============================================================
Bismark paired-end SAM format specified (default)
Number of cores to be used: 5
Output path specified as: /NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out/bismark-meth-extractor/

Summarising bedGraph parameters:
===============================================================
Generating additional output in bedGraph and coverage format
bedGraph format:    <Chromosome> <Start Position> <End Position> <Methylation Percentage>
coverage format:    <Chromosome> <Start Position> <End Position> <Methylation Percentage> <count methylated> <count non-methylated>

Using a cutoff of 1 read(s) to report cytosine positions
Reporting and sorting methylation information for all cytosine context (sorting may take a long time, you have been warned ...)
The bedGraph UNIX sort command will use the following memory setting:   '40G'. Temporary directory used for sorting is the output directory

Summarising genome-wide cytosine methylation report parameters:
===============================================================
Generating comprehensive genome-wide cytosine report
(output format: <Chromosome> <Position> <Strand> <count methylated> <count non-methylated>  <C-context>  <trinucleotide context> )
Reporting methylation for all cytosine contexts. Be aware that this will generate enormous files
Using 1-based genomic coordinates (default)
Genome folder was specified as /NetScratch/cpichot/references/

Checking file >>/NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out/bismark-deduplicate/R150mM_FDLM202341332-1a.bam<< for signs of file truncation...

But I verified my fq.gz files with md5 and they are OK. Do you think this is due to an issue in the mapping step?

Here is the end of the bismark mapping log:


Final Alignment report
======================
Sequence pairs analysed in total:   45454812
Number of paired-end alignments with a unique best hit: 23725481
Mapping efficiency: 52.2%

Sequence pairs with no alignments under any condition:  18078429
Sequence pairs did not map uniquely:    3650902
Sequence pairs which were discarded because genomic sequence could not be extracted:    22

Number of sequence pairs with unique best (first) alignment came from the bowtie output:
CT/GA/CT:   11825033    ((converted) top strand)
GA/CT/CT:   0   (complementary to (converted) top strand)
GA/CT/GA:   0   (complementary to (converted) bottom strand)
CT/GA/GA:   11900426    ((converted) bottom strand)

Number of alignments to (merely theoretical) complementary strands being rejected in total: 0

Final Cytosine Methylation Report
=================================
Total number of C's analysed:   1184941368

Total methylated C's in CpG context:    77781453
Total methylated C's in CHG context:    48492457
Total methylated C's in CHH context:    73849100
Total methylated C's in Unknown context:    709763

Total unmethylated C's in CpG context:  72834245
Total unmethylated C's in CHG context:  93411603
Total unmethylated C's in CHH context:  818572510
Total unmethylated C's in Unknown context:  1336333

C methylated in CpG context:    51.6%
C methylated in CHG context:    34.2%
C methylated in CHH context:    8.3%
C methylated in unknown context (CN or CHN):    34.7%

Deleting temporary report files...
R150mM_FDLM202341332-1a_paired_1.fq.gz.temp.1_bismark_bt2_PE_report.txt R150mM_FDLM202341332-1a_paired_1.fq.gz.temp.2_bismark_bt2_PE_report.txt R150mM_FDLM202341332-1a_paired_1.fq.gz.temp.3_bismark_bt2_PE_report.txt R150mM_FDLM202341332-1a_paired_1.fq.gz.temp.4_bismark_bt2_PE_report.txt 

Bismark completed in 0d 12h 22m 44s

====================
Bismark run complete
====================

Do you have any idea, please?

best,

shahryary commented 4 years ago

Hi @El-Castor, the "samtools view" error is definitely not the problem. Your mapping output also looks fine (you have Mapping efficiency: 52.2% and a Final Cytosine Methylation Report). We also checked the cx-reports last time and they are fine too, so the only thing left to focus on is the "methimpute" part. We have run many different custom genomes with this pipeline (sometimes we hit a bug and debugged it), but it is really difficult to solve this kind of problem without the genome reference and a sample (I understand your position regarding the unpublished data). Please let me know if you have any questions.

El-Castor commented 4 years ago

Hi @shahryary, you are right, the samtools error was not the cause of the trouble. It was due to an unwanted space after the chromosome name in my FASTA file. Now I am able to launch the model. Thanks for your help.
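
For anyone hitting the same problem, a quick way to spot and remove whitespace at the end of FASTA header lines could look like the sketch below. The function names and the tiny demo file are made up for illustration; point the functions at your own reference FASTA instead.

```shell
# Print (with line numbers) every FASTA header line that ends in whitespace.
# Both function names here are hypothetical helpers.
find_trailing_ws_headers() {
    grep -n '^>.*[[:space:]]$' "$1"
}

# Write a copy of the FASTA with trailing whitespace stripped from
# header lines only; sequence lines are left untouched.
strip_header_ws() {
    sed '/^>/ s/[[:space:]]*$//' "$1" > "$2"
}

# Tiny demo file: the first header has a trailing space, the second does not.
fasta_demo=$(mktemp)
printf '>chr1 \nACGT\n>chr2\nTTGA\n' > "$fasta_demo"

find_trailing_ws_headers "$fasta_demo"            # reports the header on line 1
strip_header_ws "$fasta_demo" "$fasta_demo.clean"
```

Writing to a new file rather than editing in place keeps the original reference untouched in case other tools were already indexed against it.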

I then tried plotEnrichment and the other functions after building the model. They work only if I save the model and relaunch a clean R session; otherwise I run out of memory (my configuration is 10 cores with 300 GB of RAM). I also tried to launch the analysis with your pipeline (step 9) and got the error below, which seems to be due to insufficient memory. With 300 GB, should I increase it?

HMM: Convergence reached!
Time spent in Baum-Welch: 8481.09s
Compiling results ... 230.67s
Error: cons memory exhausted (limit reached?)
In addition: Warning message:
Lost warning messages
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Execution halted
Error: cons memory exhausted (limit reached?)
sort: cannot read: /NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out/methimpute-out/file-processed.lst: No such file or directory

I'm now trying a session with 18 cores and 550 GB of RAM. Do you have any suggestions?

shahryary commented 4 years ago

@El-Castor Yeah, increasing the amount of RAM is a good idea, but I'm wondering what kind of genome you are running that takes more than 300 GB of RAM. However, if you are running the methimpute part from the pipeline (not the script that I shared in Dropbox), you can add two lines to remove some variables once they are no longer needed. In "src/bash/methimpute.R", remove "bismark.data" (e.g. with rm()) after methylome<- inflateMethylome(bismark.data, cytosine.positions), or remove "methylome" before the modifiedexportMethylome call. I'm not sure how much memory this will release.

El-Castor commented 4 years ago

@shahryary I used a 450-megabase genome in FASTA format. I have tried a session with 550 GB of RAM and 18 cores. I am waiting for the output; it is still running.

Yes, you are right. I was thinking of erasing them, because I saw that the memory problem happens when it tries to parse and sort the built model. I will let you know when I have more information. Thanks for responding.

El-Castor commented 4 years ago

Hi @shahryary, finally, after erasing bismark.data and the methylome, it works fine without errors. Thanks a lot for your help. I have one last question: I have no bigWig, bedGraph or DMRcaller-format files. At which step does your pipeline produce these files? Here is the output directory of the analysis:

(base) cpichot@cluster:/NetScratch/cpichot/WGBS_analysis/Zebularine_treatment_out$ ls -al ./*
./bedgraph-format:
total 4
drwxr-xr-x 1 cpichot utilisateurs   0 juin  22 17:17 .
drwxr-xr-x 1 cpichot utilisateurs 644 juin  22 17:17 ..

./bigwig-format:
total 4
drwxr-xr-x 1 cpichot utilisateurs   0 juin  22 17:17 .
drwxr-xr-x 1 cpichot utilisateurs 644 juin  22 17:17 ..

./bismark-deduplicate:
total 12900944
drwxr-xr-x 1 cpichot utilisateurs        496 juin  25 11:27 .
drwxr-xr-x 1 cpichot utilisateurs        644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs        204 juin  25 11:01 list-files.lst
-rw-r--r-- 1 cpichot utilisateurs 7269155442 juin  25 11:15 Mock_FDLM202341331-1a.bam
-rw-r--r-- 1 cpichot utilisateurs       3555 juin  25 11:15 Mock_FDLM202341331-1a.log
-rw-r--r-- 1 cpichot utilisateurs        356 juin  25 11:15 Mock_FDLM202341331-1a.txt
-rw-r--r-- 1 cpichot utilisateurs 5941377215 juin  25 11:26 R150mM_FDLM202341332-1a.bam
-rw-r--r-- 1 cpichot utilisateurs       3573 juin  25 11:27 R150mM_FDLM202341332-1a.log
-rw-r--r-- 1 cpichot utilisateurs        358 juin  25 11:26 R150mM_FDLM202341332-1a.txt

./bismark-mappers:
total 15548544
drwxr-xr-x 1 cpichot utilisateurs       2180 juin  25 00:21 .
drwxr-xr-x 1 cpichot utilisateurs        644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs        928 juin  23 20:18 list-files.lst
-rw-r--r-- 1 cpichot utilisateurs        460 juin  25 00:21 list-finished.lst
-rw-r--r-- 1 cpichot utilisateurs 8777758439 juin  24 11:59 Mock_FDLM202341331-1a.bam
-rw-r--r-- 1 cpichot utilisateurs      41064 juin  24 11:59 Mock_FDLM202341331-1a_paired.log
-rw-r--r-- 1 cpichot utilisateurs       2099 juin  24 11:59 Mock_FDLM202341331-1a.txt
-rw-r--r-- 1 cpichot utilisateurs 7143833809 juin  25 00:21 R150mM_FDLM202341332-1a.bam
-rw-r--r-- 1 cpichot utilisateurs      39611 juin  25 00:21 R150mM_FDLM202341332-1a_paired.log
-rw-r--r-- 1 cpichot utilisateurs       2099 juin  25 00:21 R150mM_FDLM202341332-1a.txt

./bismark-meth-extractor:
total 2782076
drwxr-xr-x 1 cpichot utilisateurs      1992 juin  25 18:12 .
drwxr-xr-x 1 cpichot utilisateurs       644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs       212 juin  25 11:32 list-files.lst
-rw-r--r-- 1 cpichot utilisateurs 756720795 juin  25 14:57 Mock_FDLM202341331-1a.bedGraph.gz
-rw-r--r-- 1 cpichot utilisateurs 692208059 juin  25 14:57 Mock_FDLM202341331-1a.bismark.cov.gz
-rw-r--r-- 1 cpichot utilisateurs     30257 juin  25 15:12 Mock_FDLM202341331-1a.log
-rw-r--r-- 1 cpichot utilisateurs 735561604 juin  25 17:56 R150mM_FDLM202341332-1a.bedGraph.gz
-rw-r--r-- 1 cpichot utilisateurs 664268563 juin  25 17:56 R150mM_FDLM202341332-1a.bismark.cov.gz
-rw-r--r-- 1 cpichot utilisateurs     29315 juin  25 18:12 R150mM_FDLM202341332-1a.log

./cov-seq-reports:
total 16
drwxr-xr-x 1 cpichot utilisateurs 648 juin  25 10:59 .
drwxr-xr-x 1 cpichot utilisateurs 644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs 347 juin  25 10:59 BismarkMapper-report.log
-rw-r--r-- 1 cpichot utilisateurs 218 juin  25 10:38 list-files.lst

./cx-reports:
total 9723816
drwxr-xr-x 1 cpichot utilisateurs        412 juin  27 00:00 .
drwxr-xr-x 1 cpichot utilisateurs        644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs        240 juin  26 11:10 list-files.lst
-rw-r--r-- 1 cpichot utilisateurs 4478969107 juin  26 11:25 Mock_FDLM202341331-1a.CX_report.txt
-rw-r--r-- 1 cpichot utilisateurs  508592286 juin  26 23:59 Mock_FDLM202341331-1a_L1_1_val_1_bismark_bt2_pe.deduplicated.bismark.cov.gz.CpG_report.txt
-rw-r--r-- 1 cpichot utilisateurs 4463590432 juin  26 11:40 R150mM_FDLM202341332-1a.CX_report.txt
-rw-r--r-- 1 cpichot utilisateurs  506012083 juin  27 00:00 R150mM_FDLM202341332-1a_L2_1_val_1_bismark_bt2_pe.deduplicated.bismark.cov.gz.CpG_report.txt

./dmrcaller-format:
total 4
drwxr-xr-x 1 cpichot utilisateurs   0 juin  22 17:17 .
drwxr-xr-x 1 cpichot utilisateurs 644 juin  22 17:17 ..

./fit-reports:
total 56
drwxr-xr-x 1 cpichot utilisateurs   160 juil.  1 02:49 .
drwxr-xr-x 1 cpichot utilisateurs   644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs 23505 juin  30 19:55 fit_Mock_FDLM202341331-1a_All.pdf
-rw-r--r-- 1 cpichot utilisateurs 23732 juil.  1 02:49 fit_R150mM_FDLM202341332-1a_All.pdf

./gene-reports:
total 128
drwxr-xr-x 1 cpichot utilisateurs   268 juil.  1 04:05 .
drwxr-xr-x 1 cpichot utilisateurs   644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs  9946 juin  30 20:33 gene_Mock_FDLM202341331-1a_All.pdf
-rw-r--r-- 1 cpichot utilisateurs  9946 juil.  1 03:29 gene_R150mM_FDLM202341332-1a_All.pdf
-rw-r--r-- 1 cpichot utilisateurs 45906 juin  30 21:10 genes_Mock_FDLM202341331-1a_All.txt
-rw-r--r-- 1 cpichot utilisateurs 45925 juil.  1 04:05 genes_R150mM_FDLM202341332-1a_All.txt

./logs:
total 48
drwxr-xr-x 1 cpichot utilisateurs 420 juin  26 11:10 .
drwxr-xr-x 1 cpichot utilisateurs 644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs 201 juin  25 11:27 bismark-deduplicate.log
-rw-r--r-- 1 cpichot utilisateurs 880 juin  25 00:21 bismark-mapper.log
-rw-r--r-- 1 cpichot utilisateurs 215 juin  25 18:12 bismark-meth-extract.log
-rw-r--r-- 1 cpichot utilisateurs 216 juin  25 10:38 bismark-sorting.log
-rw-r--r-- 1 cpichot utilisateurs 370 juin  25 10:59 covseq.log
-rw-r--r-- 1 cpichot utilisateurs 441 juin  26 11:40 cx-report.log
-rw-r--r-- 1 cpichot utilisateurs   0 juin  23 10:36 gen-rdata.log
-rw-r--r-- 1 cpichot utilisateurs  43 juin  23 10:36 methimpute.log
-rw-r--r-- 1 cpichot utilisateurs 172 juin  25 00:37 qc-bam.log
-rw-r--r-- 1 cpichot utilisateurs 538 juin  23 10:36 qc-fastq.log
-rw-r--r-- 1 cpichot utilisateurs 132 juin  22 20:26 trimmomatic.log

./methimpute-out:
total 18437812
drwxr-xr-x 1 cpichot utilisateurs        388 juil.  1 10:37 .
drwxr-xr-x 1 cpichot utilisateurs        644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs  511834745 juin  29 18:18 bismark.data
-rw-r--r-- 1 cpichot utilisateurs  361010510 juin  29 18:20 cytosine.positions
-rw-r--r-- 1 cpichot utilisateurs        102 juil.  1 04:05 file-processed.lst
-rw-r--r-- 1 cpichot utilisateurs        103 juil.  1 10:37 list-files.lst
-rw-r--r-- 1 cpichot utilisateurs 6402260199 juin  30 19:53 methylome_Mock_FDLM202341331-1a_All.txt
-rw-r--r-- 1 cpichot utilisateurs 6394997432 juil.  1 02:48 methylome_R150mM_FDLM202341332-1a_All.txt
-rw-r--r-- 1 cpichot utilisateurs 4281662301 juin  29 17:54 model
-rw-r--r-- 1 cpichot utilisateurs  928530432 juin  29 18:26 .RDataTmp

./methylkit-format:
total 4
drwxr-xr-x 1 cpichot utilisateurs   0 juin  22 17:17 .
drwxr-xr-x 1 cpichot utilisateurs 644 juin  22 17:17 ..

./qc-bam-reports:
total 2764
drwxr-xr-x 1 cpichot utilisateurs    328 juin  25 00:37 .
drwxr-xr-x 1 cpichot utilisateurs    644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs    204 juin  25 00:21 list-files.lst
-rw-r--r-- 1 cpichot utilisateurs 630541 juin  25 00:30 Mock_FDLM202341331-1a_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 774320 juin  25 00:30 Mock_FDLM202341331-1a_fastqc.zip
-rw-r--r-- 1 cpichot utilisateurs 627542 juin  25 00:37 R150mM_FDLM202341332-1a_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 774745 juin  25 00:37 R150mM_FDLM202341332-1a_fastqc.zip

./qc-fastq-reports:
total 11208
drwxr-xr-x 1 cpichot utilisateurs   1100 juin  23 10:36 .
drwxr-xr-x 1 cpichot utilisateurs    644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs    928 juin  23 10:11 list-files.lst
-rw-r--r-- 1 cpichot utilisateurs 616930 juin  23 10:18 Mock_FDLM202341331-1a_paired_1_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 766604 juin  23 10:18 Mock_FDLM202341331-1a_paired_1_fastqc.zip
-rw-r--r-- 1 cpichot utilisateurs 638174 juin  23 10:25 Mock_FDLM202341331-1a_paired_2_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 790984 juin  23 10:25 Mock_FDLM202341331-1a_paired_2_fastqc.zip
-rw-r--r-- 1 cpichot utilisateurs 638907 juin  23 10:25 Mock_FDLM202341331-1a_unpaired_1_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 808627 juin  23 10:25 Mock_FDLM202341331-1a_unpaired_1_fastqc.zip
-rw-r--r-- 1 cpichot utilisateurs 645430 juin  23 10:25 Mock_FDLM202341331-1a_unpaired_2_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 811644 juin  23 10:25 Mock_FDLM202341331-1a_unpaired_2_fastqc.zip
-rw-r--r-- 1 cpichot utilisateurs 618452 juin  23 10:30 R150mM_FDLM202341332-1a_paired_1_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 770915 juin  23 10:30 R150mM_FDLM202341332-1a_paired_1_fastqc.zip
-rw-r--r-- 1 cpichot utilisateurs 633116 juin  23 10:36 R150mM_FDLM202341332-1a_paired_2_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 788078 juin  23 10:36 R150mM_FDLM202341332-1a_paired_2_fastqc.zip
-rw-r--r-- 1 cpichot utilisateurs 634541 juin  23 10:36 R150mM_FDLM202341332-1a_unpaired_1_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 803324 juin  23 10:36 R150mM_FDLM202341332-1a_unpaired_1_fastqc.zip
-rw-r--r-- 1 cpichot utilisateurs 648108 juin  23 10:36 R150mM_FDLM202341332-1a_unpaired_2_fastqc.html
-rw-r--r-- 1 cpichot utilisateurs 815383 juin  23 10:36 R150mM_FDLM202341332-1a_unpaired_2_fastqc.zip

./rdata:
total 580
drwxr-xr-x 1 cpichot utilisateurs    192 juin  23 10:36 .
drwxr-xr-x 1 cpichot utilisateurs    644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs    298 juin  23 10:36 CMiso1.txt
-rwxr-xr-x 1 cpichot utilisateurs 314423 juin  23 10:36 genes.RData
-rw-r--r-- 1 cpichot utilisateurs    252 juin  23 10:36 Ref_Chr.RData
-rwxr-xr-x 1 cpichot utilisateurs 261746 juin  23 10:36 TEs.RData

./tes-reports:
total 80
drwxr-xr-x 1 cpichot utilisateurs   264 juil.  1 03:39 .
drwxr-xr-x 1 cpichot utilisateurs   644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs  6805 juin  30 20:06 TEs_Mock_FDLM202341331-1a_All.pdf
-rw-r--r-- 1 cpichot utilisateurs 26633 juin  30 20:43 TEs_Mock_FDLM202341331-1a_All.txt
-rw-r--r-- 1 cpichot utilisateurs  6805 juil.  1 03:03 TEs_R150mM_FDLM202341332-1a_All.pdf
-rw-r--r-- 1 cpichot utilisateurs 26633 juil.  1 03:39 TEs_R150mM_FDLM202341332-1a_All.txt

./trimmomatic-files:
total 12937080
drwxr-xr-x 1 cpichot utilisateurs        564 juin  22 20:26 .
drwxr-xr-x 1 cpichot utilisateurs        644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs        584 juin  22 18:43 list-files.lst
-rw-r--r-- 1 cpichot utilisateurs 3565756278 juin  22 19:40 Mock_FDLM202341331-1a_paired_1.fq.gz
-rw-r--r-- 1 cpichot utilisateurs 3639845652 juin  22 19:40 Mock_FDLM202341331-1a_paired_2.fq.gz
-rw-r--r-- 1 cpichot utilisateurs  103414048 juin  22 19:40 Mock_FDLM202341331-1a_unpaired_1.fq.gz
-rw-r--r-- 1 cpichot utilisateurs   41013094 juin  22 19:40 Mock_FDLM202341331-1a_unpaired_2.fq.gz
-rw-r--r-- 1 cpichot utilisateurs 2867029325 juin  22 20:26 R150mM_FDLM202341332-1a_paired_1.fq.gz
-rw-r--r-- 1 cpichot utilisateurs 2916278777 juin  22 20:26 R150mM_FDLM202341332-1a_paired_2.fq.gz
-rw-r--r-- 1 cpichot utilisateurs   78975417 juin  22 20:26 R150mM_FDLM202341332-1a_unpaired_1.fq.gz
-rw-r--r-- 1 cpichot utilisateurs   35225630 juin  22 20:26 R150mM_FDLM202341332-1a_unpaired_2.fq.gz

./trimmomatic-logs:
total 16
drwxr-xr-x 1 cpichot utilisateurs  176 juin  22 19:40 .
drwxr-xr-x 1 cpichot utilisateurs  644 juin  22 17:17 ..
-rw-r--r-- 1 cpichot utilisateurs 2608 juin  22 19:40 trimmomatic-log-Mock_FDLM202341331-1a.log
-rw-r--r-- 1 cpichot utilisateurs 1421 juin  22 20:26 trimmomatic-log-R150mM_FDLM202341332-1a.log

Do you know why these folders are empty? How can I produce these files? Can the DMRcaller format be used for differential analysis? Which DMRcaller do you use, please?

Thanks a lot. Best,

shahryary commented 4 years ago

@El-Castor Happy to hear that. You can convert to other formats from the main menu entry "Outputs/Reports" in the pipeline. It should automatically read all methylome files from the "methimpute-out" directory (as I see, you generated two files). Thanks again for using our pipeline and for your feedback.
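
To check which output folders are still empty before and after running the conversion, something like the following sketch may help. `list_empty_dirs` is a hypothetical helper and the demo directory only mimics the layout shown in the listing above; `-mindepth`, `-maxdepth` and `-empty` are GNU/BSD `find` extensions, so this assumes one of those implementations.

```shell
# List the immediate subdirectories of an output directory that contain no files.
# list_empty_dirs is a hypothetical helper, not part of MethylStar.
list_empty_dirs() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -empty | sort
}

# Demo layout mimicking part of the listing above: one empty folder, one populated.
out_demo=$(mktemp -d)
mkdir "$out_demo/bigwig-format" "$out_demo/methimpute-out"
touch "$out_demo/methimpute-out/model"

list_empty_dirs "$out_demo"    # prints only the bigwig-format path
```

On a real run, pass the pipeline's result directory instead of the demo directory.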