Closed jasonsaunderswilliams closed 4 years ago
Hi @jasonsaunderswilliams , I could successfully run it in Docker:
docker run -v ${PWD}:/data -w /data cd240cd8fcef ViewBS MethOneRegion MethOverRegion --region TAIR10_GFF3_genes_chr1.bed --sample bis_WT.tab.gz,WT --sample bis_cmt23.tab.gz,cmt23 --prefix bis_gene_chr1_sample --context CG --outdir results_conda/MethOverRegion_Docker
Could you please try the test data on your server to see what will happen? Thanks.
Best, Shaojun
Hi Shaojun,
I would like to try the test data, however I can't see the files in your command in the data folder here. Could you direct me to which test data files you are referring to and I will duly run it.
Many thanks,
Jason
Hi Jason, Please download the test data here: https://github.com/xie186/ViewBS#download-test-data Best, Shaojun
Hi Shaojun,
I ran the test data succesfully this morning:
Subcommand: MethOverRegion
/inputfiles/bis_WT.tab.gz,WT /inputfiles/bis_cmt23.tab.gz,cmt23
/inputfiles/TAIR10_GFF3_genes_chr1.bed
It seemed that the reigons you provided are Genes. Please make sure of this.
Output directory is: /inputfiles
Output prefix: bis_gene_chr1_sample
Start calculate methylation information for target context
CG
.........
.........
Meth::OverRegion=HASH(0x2dad290): Usage: R --vanilla --slave --height <fig height> --width <fig width> --adjustXaxis <X> --input <input tab> --xlab <Gene> --output <output> < OverRegion.R
/usr/lib/R/bin/exec/R --slave --no-restore --file=/home/biodocker/ViewBS/lib/Meth/OverRegion.R --args --input /inputfiles/bis_gene_chr1_sample_MethOverRegion_CG.txt --xlab Gene --height 10 --width 10 --adjustXaxis 10 --output /inputfiles/bis_gene_chr1_sample_MethOverRegion_CG.pdf
Running time:206 wallclock secs (197.35 usr 1.31 sys + 1.34 cusr 0.46 csys = 200.46 CPU)
I noticed that the test data indices are .tbi and your sample files are tab.Mine are .txt and .csi respectively.
Best wishes,
Jason
Hi again Shaojun,
I attach (google drive) the files I'm working with before processing them. I suspect there is something wrong in the way I'm using tabix and the output files I'm getting. Do you sort files before zipping and then tabixing? Could you possibly send me the command you use to generate the .tbi files in the test data folder?
I can then try and perform these commands on the CpG_report files I get as output from Bismark.
All the best,
Jason
Hi Shaojun,
I've gone back to bismark following your instructions to generate the same files as you. I'll follow these and get back to you ;)
Best,
Jason
Thanks.
Hi again Shaojun,
So I followed the ViewBS github manual to process through Bismark. It's a bit confusing because ViewBS asks for the Genome-wide cytosine methylation report. These are generated as : bismark_bt2.CpG_report.txt files...
ubuntu@repro-group:~$ sudo head /home/ubuntu/work/Jason/bismarkgenomes/3944Swan_CS001_hg38_bismark_bt2.CpG_report.txt
chr19 60119 + 0 0 CG CGA
chr19 60120 - 0 0 CG CGG
chr19 60172 + 0 0 CG CGA
chr19 60173 - 0 0 CG CGA
chr19 60183 + 0 0 CG CGG
chr19 60184 - 0 0 CG CGT
chr19 60340 + 0 0 CG CGT
chr19 60341 - 0 0 CG CGT
chr19 60376 + 0 0 CG CGG
chr19 60377 - 0 0 CG CGT
These fit the
<chromosome> <position> <strand> <count methylated> <count unmethylated> <C-context> <trinucleotide context>
format as described by the bismark manual and yourselves.
The .cov.gz filetype generated from the bismark_methylation_extractor step are these:
chr19 60857 60857 80 4 1
chr19 60882 60882 100 5 0
chr19 95567 95567 100 1 0
chr19 103529 103529 93.3333333333333 28 2
chr19 103585 103585 57.5 23 17
chr19 103675 103675 70 21 9
chr19 103691 103691 86.6666666666667 26 4
chr19 103735 103735 73.3333333333333 22 8
chr19 119558 119558 100 1 0
chr19 146618 146618 0 0 1
These do not fit this format, but do have the .cov file extension.
Bismark says that the Genome-wide Cytosine methylation report (.txt file above) will be sorted by chromosome coordinate, but it isn't.
I cannot use tabix on either of these files nor have I been able to sort them.
Could you clear up for me which of the above Bismark outputs I should be using as input and also help me by letting me know how I can sort and tabix them appropriately so I can feed them into ViewBS?
Thanks very much for all your help.
Best wishes,
Jason
Hi again Shaojun,
I managed to solve my problems with a little manipulation of the files. I paste my code here for anyone with the same problems. Maybe it would be useful for you to clear up that the actual genome-wide cytosine reports are text files that - in my case - needed to be sorted then tabixed. Also, when downloading bed files for the --region input some manipulation was necessary to match the format of the test data input .bed files. I paste my code here in case it is useful to anyone starting with the unsorted genome-wide cytosine report .txt files:
bismark_methylation_extractor --multicore 5 --bedGraph /inputfiles/bismarkgenomes/4046Swan_Donor_hg38_bismark_bt2.bam --cytosine_report -o /inputfiles/bismarkgenomes --genome_folder /inputfiles/bismarkgenomes
2. Sort genome-wide Cytosine methylation report by chromosome coordinate:
sort -k1,1V -k2,2n /inputfiles/bismarkgenomes/4046Swan_Donor_hg38_bismark_bt2.CpG_report.tab -o /inputfiles/bismarkgenomes/ViewBS/4046Swan_Donor_hg38_bismark_bt2CpG_reportsorted.tab
3. bgzip through HTSlib
bgzip /inputfiles/bismarkgenomes/ViewBS/4046Swan_Donor_hg38_bismark_bt2CpG_reportsorted.tab
4. tabix through HTSlib to generate .tbi indices
tabix -p vcf /inputfiles/bismarkgenomes/ViewBS/3944.tab.gz
Hi,
I am attempting to perform the MethOveRegion function of ViewBS. I sorted, gzipped and tabixed the reference file I have "hg38promsorted.bed.gz". The index file for this "hg38promsorted.bed.gz.csi" is in the same folder as "hg38promsorted.bed.gz". I also tabixed "3944Swan_CS001_hg38.fastq.gz_bismark_bt2.CpG_report.txt", its index is also in the same folder. I should mention that I'm working inside a docker:
sudo docker run -ti --rm -v /home/ubuntu/work/Jason/:/inputfiles xie186/viewbs ViewBS MethOverRegion --region /inputfiles/hg38promsorted.bed.gz --sample /inputfiles/3944Swan_CS001_hg38.fastq.gz_bismark_bt2.CpG_report.txt,T0 --outdir /inputfiles --context CG
Subcommand: MethOverRegion /inputfiles/3944Swan_CS001_hg38.fastq.gz_bismark_bt2.CpG_report.txt,T0 /inputfiles/hg38promsorted.bed.gz
It seemed that the reigons you provided are Genes. Please make sure of this.
Output directory is: /inputfiles Default output prefix is used: MethOverRegion Start calculate methylation information for target context CG Tabix region lookup requires a gzipped, tabixed bedfile at /usr/local/lib/x86_64-linux-gnu/perl/5.22.1/Bio/DB/HTS/Tabix.pm line 40.
The error suggests that the .csi needs to be in that location? I don't understand enough about how the docker works to know where to put the tabix files or tell ViewBS where to find them Any workaround suggestions for this would be appreciated.
Best wishes,
Jason