xie186 / ViewBS

ViewBS - a powerful toolkit for visualization of high-throughput bisulfite sequencing data
GNU General Public License v3.0
83 stars 27 forks source link

Heatmap/Violin plot issues using amplicons as regions #9

Closed sdewell closed 7 years ago

sdewell commented 7 years ago

Hello,

Are there any known issues using a set of amplicon coordinates as a reference for the Heatmap functions? I'm using three amplicons as the 'genome' to which I've aligned, and most analyses are working well but when running the heatmap function I receive an error:

Error in sample.int(length(x), size, replace, prob) : cannot take a sample larger than the population when 'replace = FALSE' Calls: tryCatch ... tryCatchList -> tryCatchOne -> -> sample -> sample.int Execution halted

I realize it's an R error but not sure how to fix the data structure issues feeding into the R script. There are 22 samples, and three amplicons ~100-175bp. The violin plots are not being created as well. I've previously used a whole genome reference, and then intersected the .cov files with my coordinates for the bis_rep files; this works but is rather slow and takes a lot of disk space for large sample numbers.

Thanks for any suggestions!

readbio commented 7 years ago

Hi, can you provide the table generated by ViewBS? Then I can test the R code and fix the problem. Thanks.

sdewell commented 7 years ago

Here's my command and output: /usr/bin/ViewBS/ViewBS MethHeatmap --verbose --region ../amplicons.bed --sample 01-trimmed_bismark.bis_rep.cov.gz,01 --sample 02-trimmed_bismark.bis_rep.cov.gz,02 --sample 03-trimmed_bismark.bis_rep.cov.gz,03 --sample 04-trimmed_bismark.bis_rep.cov.gz,04 --sample 05-trimmed_bismark.bis_rep.cov.gz,05 --sample 06-trimmed_bismark.bis_rep.cov.gz,06 --sample 07-trimmed_bismark.bis_rep.cov.gz,07 --outdir MethHeatmap --context CHG --prefix CHG

Subcommand: MethHeatmap 01-trimmed_bismark.bis_rep.cov.gz,01 02-trimmed_bismark.bis_rep.cov.gz,02 03-trimmed_bismark.bis_rep.cov.gz,03 04-trimmed_bismark.bis_rep.cov.gz,04 05-trimmed_bismark.bis_rep.cov.gz,05 06-trimmed_bismark.bis_rep.cov.gz,06 07-trimmed_bismark.bis_rep.cov.gz,07 ../amplicons.bed Output directory is: MethHeatmap Output prefix: CHG Start calculate methylation information for target context CHG Error in sample.int(length(x), size, replace, prob) : cannot take a sample larger than the population when 'replace = FALSE' Calls: tryCatch ... tryCatchList -> tryCatchOne -> -> sample -> sample.int Execution halted Meth::Heatmap=HASH(0x1957810): WARNING: unknown option '--input'

ARGUMENT '/home/scottd/projects/methylation_apr_2017/trim_galore/combined/amplicons_ref/MethHeatmap/CHG_MethHeatmap_CHG.txt' ignored

WARNING: unknown option '--output1'

ARGUMENT '/home/scottd/projects/methylation_apr_2017/trim_galore/combined/amplicons_ref/MethHeatmap/CHG_MethHeatmap_CHG.pdf' ignored

WARNING: unknown option '--output2'

ARGUMENT '/home/scottd/projects/methylation_apr_2017/trim_galore/combined/amplicons_ref/MethHeatmap/CHG_MethHist_CHG.pdf' ignored

WARNING: unknown option '--cluster_cols'

ARGUMENT 'FALSE' ignored

WARNING: unknown option '--cluster_rows'

ARGUMENT 'TRUE' ignored

WARNING: unknown option '--height'

ARGUMENT '10' ignored

WARNING: unknown option '--width'

ARGUMENT '10' ignored

WARNING: unknown option '--height2'

ARGUMENT '10' ignored

WARNING: unknown option '--width2'

ARGUMENT '10' ignored

WARNING: unknown option '--random_region'

ARGUMENT '20' ignored

Useage: R --vanilla --slave --input <> --height --width --height2 --width2 --random_region --output1 --output2 < trans_DMS_heatmap.R /usr/lib/R/bin/exec/R --vanilla --slave --input /home/scottd/projects/methylation_apr_2017/trim_galore/combined/amplicons_ref/MethHeatmap/CHG_MethHeatmap_CHG.txt --output1 /home/scottd/projects/methylation_apr_2017/trim_galore/combined/amplicons_ref/MethHeatmap/CHG_MethHeatmap_CHG.pdf --output2 /home/scottd/projects/methylation_apr_2017/trim_galore/combined/amplicons_ref/MethHeatmap/CHG_MethHist_CHG.pdf --cluster_cols FALSE --cluster_rows TRUE --height 10 --width 10 --height2 10 --width2 10 --random_region 20 FALSE TRUE

Running time: 0 wallclock secs ( 0.01 usr 0.00 sys + 0.29 cusr 0.03 csys = 0.33 CPU)

The txt file is all NA 01 02 03 04 05 06 07 chr19_58552755_58552920 NA NA NA NA NA NA NA chr11_119089504_119089643 NA NA NA NA NA NA NA chr22_24572821_24572937 NA NA NA NA NA NA NA

Example lines from bgzipped/tabix'ed cov file: chr19 10 - 64 40 CHG CTG chr19 11 + 18988 1 CHH CCA chr19 12 + 20073 7 CHH CAT chr19 16 + 22070 7 CHH CCT chr19 17 + 21995 49 CHH CTC

My bed file is standard and workd for other analyses. There are reads/methylation information outside of the bed coordinates, but that shouldn't be a problem. Appreciate your help and thanks for the tool!

readbio commented 7 years ago

Thanks for your information. Since I can't repeat your command line, I can just guess the possible reason.

One possibility is that ViewBS MethHeatmap will exclude the sites with read depth < 5 by default. So if you use the default parameters and all the read depths of the given regions are below 5, no information will be return. So maybe what you can do is set "--minDepth" as 1 and try again.

See the corresponding code below:

   if(!$opts_sub->{"minDepth"}){
        $opts_sub->{"minDepth"} = 5;
    }

Please let me know if this works or not.

sdewell commented 7 years ago

Thank you - the minDepth helped, though I had also made an error in my bismark cov file that was compounding the errors. Appreciate the responses!

readbio commented 7 years ago

Glad to hear that you fix your problem. Please let us know if you have any questions or suggestions. These can help us to improve ViewBS in the future. Thanks.