hputnam / Meth_Compare

Describe gene mapping differences across methods #3

Closed sr320 closed 4 years ago

sr320 commented 4 years ago

look at reports

mgavery commented 4 years ago

Questions:

  1. Are the fragment (insert) sizes the same for all libraries? How does Bismark (Bowtie) get information about insert size? Couldn't this affect mapping?

  2. Is read quality different? This should be obvious from the reports.

  3. Run FastQC on the trimmed reads; this should help with the comparison.
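
For question 1, one way to compare insert sizes between libraries is to summarize the template length (TLEN, column 9 of the SAM format) from each library's paired-end alignments. The following is a minimal sketch, assuming the alignments are available as SAM text; the function name `insert_sizes` and the example records are hypothetical and not taken from this project.

```python
import statistics

def insert_sizes(sam_lines):
    """Collect positive template lengths (SAM column 9, TLEN) from SAM text lines."""
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:  # not a complete alignment record
            continue
        tlen = int(fields[8])
        # TLEN is positive for the leftmost mate and negative for the rightmost,
        # so keeping only positive values counts each pair once (0 = unavailable).
        if tlen > 0:
            sizes.append(tlen)
    return sizes

# Made-up records, for illustration only (SEQ and QUAL left as "*")
example = [
    "@HD\tVN:1.6",
    "r1\t99\tchr1\t100\t42\t50M\t=\t250\t200\t*\t*",
    "r1\t147\tchr1\t250\t42\t50M\t=\t100\t-200\t*\t*",
    "r2\t99\tchr1\t500\t42\t50M\t=\t750\t300\t*\t*",
    "r2\t147\tchr1\t750\t42\t50M\t=\t500\t-300\t*\t*",
]
sizes = insert_sizes(example)
print(len(sizes), statistics.median(sizes))
```

Running this per library and comparing the medians would show whether the methods differ. As far as the aligner side goes, Bismark's paired-end mode has `-I/--minins` and `-X/--maxins` options that bound the insert sizes accepted as valid pairs, so libraries with longer inserts could plausibly lose pairs at the default maximum.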

kubu4 commented 4 years ago

> Are the fragment (insert) sizes the same for all libraries? How does Bismark (Bowtie) get information about insert size? Couldn't this affect mapping?

> Is read quality different? This should be obvious from the reports.

> Run FastQC on the trimmed reads; this should help with the comparison.

Already done:

MultiQC report for RRBS/WGBS:

https://gannet.fish.washington.edu/Atumefaciens/20200305_methcompare_fastp_trimming/multiqc_report.html

mgavery commented 4 years ago

Why is MBD-BS-Seq not mapping as well as the other methods? One possibility is that symbiont DNA is highly methylated and is preferentially pulled down relative to host DNA with this method.

We could test this by mapping the reads to a symbiont genome in silico.
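
A rough way to run that test would be to align the same reads against the host and the symbiont genome, then compare mapping efficiency per method. The sketch below assumes each alignment report contains a line of the form `Mapping efficiency:` followed by a percentage, which is how I remember Bismark reports looking; the report snippets and the `mapping_efficiency` helper are made up for illustration.

```python
import re

def mapping_efficiency(report_text):
    """Return the percentage on the first 'Mapping efficiency:' line, or None."""
    match = re.search(r"Mapping efficiency:\s*([\d.]+)\s*%", report_text)
    return float(match.group(1)) if match else None

# Made-up report snippets, for illustration only (not real results)
host_report = "Sequence pairs analysed in total:\t1000\nMapping efficiency:\t35.0%\n"
symbiont_report = "Sequence pairs analysed in total:\t1000\nMapping efficiency:\t12.5%\n"

for label, text in [("host", host_report), ("symbiont", symbiont_report)]:
    print(label, mapping_efficiency(text))
```

If MBD-BS-Seq libraries showed a clearly higher symbiont mapping efficiency than the RRBS and WGBS libraries, that would be consistent with preferential pull-down of symbiont DNA.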

mgavery commented 4 years ago

Symbiodinium genome and methylation reference: https://www.nature.com/articles/s42003-018-0098-3 ("Symbiodinium genomes reveal adaptive evolution of functions related to coral-dinoflagellate symbiosis", Communications Biology)

kubu4 commented 4 years ago

Consider sorting reads taxonomically before mapping?

mgavery commented 4 years ago

How do you do that?

kubu4 commented 4 years ago

I've done this using MEGAN6 to separate dinoflagellate sequences from crab sequences.

kubu4 commented 4 years ago

Basically, you run BLAST and InterProScan on the reads, then the software assigns a taxonomy to each read and lets you pull reads at any level of the taxonomic hierarchy.
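
To make the "pull reads at any level" idea concrete, here is a minimal sketch of the filtering step only. It assumes the per-read taxonomic assignments have already been exported into a mapping of read ID to lineage; that mapping, the lineages, and the `reads_in_taxon` helper are all hypothetical and do not reflect MEGAN6's actual interface or output format.

```python
def reads_in_taxon(lineages, taxon):
    """Return sorted IDs of reads whose lineage includes `taxon` at any rank."""
    return sorted(
        read_id for read_id, lineage in lineages.items() if taxon in lineage
    )

# Hypothetical per-read lineages (root to leaf), made up for illustration
lineages = {
    "read1": ["Eukaryota", "Dinophyceae", "Symbiodiniaceae"],
    "read2": ["Eukaryota", "Cnidaria", "Anthozoa"],
    "read3": ["Eukaryota", "Dinophyceae"],
    "read4": ["Bacteria"],
}

symbiont_reads = reads_in_taxon(lineages, "Dinophyceae")
host_reads = reads_in_taxon(lineages, "Cnidaria")
print(symbiont_reads, host_reads)
```

The resulting ID lists could then be used to split the FASTQ files into host and symbiont sets before mapping.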

hputnam commented 4 years ago

We were able to use Sym C1 for tests of alignments in the Pacuta data.