hputnam closed this issue 7 years ago
My first suggestion is to try different hardware...
The latter particularly sounds like that might be the case.
Related, and for comparison, here is some Bismark output from one of said files:
https://genomevolution.org/coge/ExperimentView.pl?eid=9227
Can you provide a url to the SAM file you created with bismark?
http://owl.fish.washington.edu/symbiodinium/Oly_MBD/Bismark/
I have tried roadrunner and my own computer, loading all files or just one file, and R still crashes. I have tried the direct output of Bismark BAM files, as well as sorted BAM and SAM files. All end up crashing R.
I can get a bedgraph from the alignment, if that helps
https://sr320.github.io/SAM-lacked-header/
Related: I think a lot of the issues with the alignment files could be that some info is missing from the header. In the above example, it will not work if I do not provide the reference FASTA.
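One quick way to test the missing-header guess before loading anything into R is to count the `@SQ` (reference sequence) lines in the SAM header. This is a minimal sketch that assumes a plain-text SAM file; the function name `count_sq_lines` and the file name in the usage comment are hypothetical. For a BAM file, `samtools view -H` prints the header so it can be inspected the same way.

```shell
# Count the @SQ (reference sequence) header lines in a plain-text SAM file.
# A result of 0 means the header carries no reference sequence entries.
count_sq_lines() {
    sam_file="$1"
    # Header lines start with '@'; @SQ lines name each reference sequence.
    # grep -c exits non-zero when it finds nothing, so '|| true' keeps the
    # function from failing while still printing the count (0).
    grep -c '^@SQ' "$sam_file" || true
}

# Usage (hypothetical file name):
#   count_sq_lines zr1394_1_bismark_bt2.sam
```

If the count is 0 while the alignments reference many scaffolds, the header is the likely culprit.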
@seanb80 here are the files to try in Bismark. http://owl.fish.washington.edu/nightingales/O_lurida/20160203_mbdseq/
Try some of the concatenated ones: zr1394_1.fastq.gz, zr1394_2.fastq.gz, zr1394_3.fastq.gz, zr1394_4.fastq.gz, zr1394_5.fastq.gz, zr1394_6.fastq.gz, zr1394_7.fastq.gz, etc.
The file names correspond to the sample names here: https://github.com/hputnam/Oly_Oyster_DNA_Methylation. HC is "treatment" 0 and SS is "treatment" 1.
I used bowtie2 2.2.9, bismark 0.16.3, samtools 0.1.19, and R 3.2.5.
Found the 10k file in the repeat masker issue. Running the genome prep now!
./bismark_genome_preparation ~/Documents/BismarkData
./bismark --genome ~/Documents/BismarkData ~/Documents/BismarkData/zr1394_1.fastq.gz --output_dir ~/Documents/BismarkData/BismarkOutput
./bismark_methylation_extractor -s --scaffolds --merge_non_CpG --bedGraph --zero_based ~/Documents/BismarkData/BismarkOutput/zr1394_1_bismark_bt2.bam --output ~/Documents/BismarkData/BismarkOutput/
Worked to completion on Emu! I'll upload the results of the first run to Owl and update with a link shortly.
http://owl.fish.washington.edu/scaphapoda/Sean/BismarkOutput.tar.gz
Would you like me to start the other combined files?
What were you thinking for the sort workaround? Do you think it is best to just run it on Emu? If so I will have to load all the cleaned files.
I think it would be easier to run it on Emu. Otherwise, someone posted a workaround where you install GNU sort on OS X and then symlink it so it looks like Unix sort, which seems like a pain and not a guaranteed fix, as I don't know the particular differences between GNU sort and Unix sort.
I wrote a script that should iterate through all of the files and run the different bismark programs on the different files. Is the cleaning particularly intensive? If not, I could just throw that at the head of the script and reclean the files off of Owl.
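For reference, a loop like that can be sketched as a dry run that only prints the commands, reusing the flags from the single-file run above. This is not the actual script; the function name `print_bismark_commands` is hypothetical, and the BAM name is assumed to follow the `<sample>_bismark_bt2.bam` pattern seen in that run.

```shell
# Print (do not run) the Bismark alignment and methylation extractor
# commands for each FASTQ file given on the command line.
print_bismark_commands() {
    genome_dir="$1"
    out_dir="$2"
    shift 2
    for fastq in "$@"; do
        # zr1394_1.fastq.gz -> zr1394_1
        sample=$(basename "$fastq" .fastq.gz)
        printf '%s\n' "./bismark --genome $genome_dir $fastq --output_dir $out_dir"
        printf '%s\n' "./bismark_methylation_extractor -s --scaffolds --merge_non_CpG --bedGraph --zero_based $out_dir/${sample}_bismark_bt2.bam --output $out_dir"
    done
}

# Usage:
#   print_bismark_commands ~/Documents/BismarkData \
#       ~/Documents/BismarkData/BismarkOutput \
#       ~/Documents/BismarkData/zr1394_*.fastq.gz
```

Printing first makes it easy to eyeball the paths before committing to a long run; the output can then be saved to a file and executed once it looks right.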
For now can you concatenate s1-s6 for each of the 18 samples and run those concatenated files all the way through the methylation extractor step? Thanks! http://owl.fish.washington.edu/nightingales/O_lurida/20160203_mbdseq/
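The concatenation step could look something like the sketch below. Gzip files can be joined with plain `cat`, since concatenated gzip members still form a valid gzip stream. The per-lane naming `<sample>_s<lane>.fastq.gz` and the function name `concat_lanes` are assumptions for illustration; adjust the pattern to match the actual file names on Owl.

```shell
# Concatenate lanes s1-s6 of one sample into a single gzipped FASTQ.
# ASSUMPTION: inputs are named <sample>_s<lane>.fastq.gz (hypothetical pattern).
concat_lanes() {
    sample="$1"
    in_dir="$2"
    out_dir="$3"
    out_file="$out_dir/$sample.fastq.gz"

    # Check that every lane exists before writing anything.
    for lane in 1 2 3 4 5 6; do
        part="$in_dir/${sample}_s${lane}.fastq.gz"
        if [ ! -f "$part" ]; then
            echo "missing: $part" >&2
            return 1
        fi
    done

    # Start from an empty output file, then append each lane in order.
    : > "$out_file"
    for lane in 1 2 3 4 5 6; do
        cat "$in_dir/${sample}_s${lane}.fastq.gz" >> "$out_file"
    done
}

# Usage (hypothetical directories):
#   concat_lanes zr1394_1 ./raw ./combined
```

Checking for all six lanes up front avoids silently producing a combined file that is missing a lane.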
@seanb80 the Bismark methylation extractor step failed. Can you re-run it?
I am getting strange histograms with no peak at 0% methylation (https://github.com/hputnam/Oly_Oyster_DNA_Methylation/blob/master/Notebooks/3_Clustering_Differential_Methylation_R.ipynb), in contrast to the expected peaks around 0% and 100% shown in the manual (https://bioconductor.org/packages/devel/bioc/vignettes/methylKit/inst/doc/methylKit.html).
My guess is that in the post-alignment step there is a setting to keep or ignore sites with 0 methylation (i.e., -z in methratio.py).
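To check whether zero-methylation sites are being dropped before the data ever reaches methylKit, one option is to count them directly in the aligner's per-site output. The sketch below assumes a tab-separated Bismark-style coverage file with the columns chromosome, start, end, percent methylation, methylated count, unmethylated count; the function name `count_methylation_extremes` is hypothetical.

```shell
# Count sites at exactly 0% and 100% methylation in a coverage file.
# ASSUMPTION: column 4 holds the methylation percentage (0-100).
# Reads the named file, or standard input when no file is given.
count_methylation_extremes() {
    awk -F '\t' '
        { total++ }
        $4 == 0   { zero++ }
        $4 == 100 { full++ }
        END { printf "zero=%d full=%d total=%d\n", zero, full, total }
    ' "$@"
}

# Usage (hypothetical file names):
#   count_methylation_extremes sample.cov
#   gzip -dc sample.cov.gz | count_methylation_extremes
```

If `zero` is 0 across all samples, the missing peak is probably introduced upstream of R rather than by the plotting step.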
I have been testing the switch to Bismark on the 18 Oly MBD samples. The good news is that the alignment SAM files contain data for scaffold numbers >10,000. I am failing, however, to get the data out of Bismark and into methylKit despite trying from both directions.
Issue 1 - Bismark: the bismark_methylation_extractor function fails to output bedGraph or coverage information.
Issue 2 - methylKit: the processBismarkAln function will not load sorted SAM or BAM files directly into methylKit and crashes R.
Any thoughts, @sr320?
See bottom of HP notebook post for more detail