pontussk / PMDtools

Compute postmortem damage patterns and decontaminate ancient genomes
GNU General Public License v3.0
15 stars 7 forks source link

Problem plotting merged reads #12

Open arenvale opened 1 year ago

arenvale commented 1 year ago

Hi! I am trying to use PMDtools to generate damage plots. I have paired-end sequencing reads and have merged them. The issue is that I only observe the peak towards 3' but not the 5', and I suspect it may have to do with the fact that I merged the reads. My code:

  samtools calmd -u sample.mapped.q30.sorted.rmdup.bam ref.fasta > sample.mapped.q30.sorted.rmdup.md.bam
  samtools view -h sample.mapped.q30.sorted.rmdup.md.bam | python2 pmdtools.0.60.py --first --requirebaseq 30 --CpG -n 100000 > sample.mapped.q30.sorted.rmdup.md.CpG.scores.txt
  samtools view -h sample.mapped.q30.sorted.rmdup.md.bam | python2 pmdtools.0.60.py --platypus > PMD_temp.txt
  R CMD BATCH plotPMD.v2.R

Does this make sense? Should I plot the damage using a a mapping file of the non merged reads? Thanks a lot. Valeria

pontussk commented 1 year ago

Hi Valeria,

Yes that sounds like a merging issue potentially. You could check the merging parameters, or maybe use another merger. AdapterRemoval is the one used in the nfcore-eager reproducible workflow I believe.

Best, Pontus

On Wed, 18 Jan 2023 at 18:27, arenvale @.***> wrote:

Hi! I am trying to use PMDtools to generate damage plots. I have paired-end sequencing reads and have merged them. The issue is that I only observe the peak towards 3' but not the 5', and I suspect it may have to do with the fact that I merged the reads. My code:

samtools calmd -u sample.mapped.q30.sorted.rmdup.bam ref.fasta > sample.mapped.q30.sorted.rmdup.md.bam samtools view -h sample.mapped.q30.sorted.rmdup.md.bam | python2 pmdtools.0.60.py --first --requirebaseq 30 --CpG -n 100000 > sample.mapped.q30.sorted.rmdup.md.CpG.scores.txt samtools view -h sample.mapped.q30.sorted.rmdup.md.bam | python2 pmdtools.0.60.py --platypus > PMD_temp.txt R CMD BATCH plotPMD.v2.R

Does this make sense? Should I plot the damage using a a mapping file of the non merged reads? Thanks a lot. Valeria

— Reply to this email directly, view it on GitHub https://github.com/pontussk/PMDtools/issues/12, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABO76I4LWXYCQPKH4FRBJWDWTAY2XANCNFSM6AAAAAAT7NDNGY . You are receiving this because you are subscribed to this thread.Message ID: @.***>

arenvale commented 1 year ago

Good afternoon! Thank you very much for your reply and sorry for taking so long to respond. I have indeed used AdapterRemoval to merge the reads with the following code: AdapterRemoval --file1 "${NAME}"_R1.cutadapt.fastq.gz --file2 "${NAME}"_R2.cutadapt.fastq.gz --collapse-conservatively --minalignmentlength 11 --basename "${NAME}".adapterremoval

The damage result I got after do all the bioinformatics processing is as follows: LZ3-PMD_plot.frag.pdf

On the other hand, I tried without merging the reads, doing an alignment with aln+sampe, and the damage plots I got are the following (for the same sample): LZ3-PMD_plot.frag.pdf

The point is that this protocol would not be the most suitable for further analysis, so it would only be useful for me to demonstrate the presence of aDNA damage. Is there any change I can make to the merging parameters with AdapterRemoval or the plotting with PMD to get the "real" pattern? Do you think it would be correct to use the damage plot on the alignment without merging reads, even if it is not the one I use for further analysis?

Thank you very much for your help!

pontussk commented 1 year ago

Hi,

It may depend on the purpose for which you want to show these damage patterns, but to me they both demonstrate the ancient DNA is present in your library.

Best, Pontus

On Mon, 10 Apr 2023 at 21:31, arenvale @.***> wrote:

Good afternoon! Thank you very much for your reply and sorry for taking so long to respond. I have indeed used AdapterRemoval to merge the reads with the following code: AdapterRemoval --file1 "${NAME}"_R1.cutadapt.fastq.gz --file2 "${NAME}"_R2.cutadapt.fastq.gz --collapse-conservatively --minalignmentlength 11 --basename "${NAME}".adapterremoval

The damage result I got after do all the bioinformatics processing is as follows: LZ3-PMD_plot.frag.pdf https://github.com/pontussk/PMDtools/files/11193649/LZ3-PMD_plot.frag.pdf

On the other hand, I tried without merging the reads, doing an alignment with aln+sampe, and the damage plots I got are the following (for the same sample): LZ3-PMD_plot.frag.pdf https://github.com/pontussk/PMDtools/files/11193648/LZ3-PMD_plot.frag.pdf

The point is that this protocol would not be the most suitable for further analysis, so it would only be useful for me to demonstrate the presence of aDNA damage. Is there any change I can make to the merging parameters with AdapterRemoval or the plotting with PMD to get the "real" pattern? Do you think it would be correct to use the damage plot on the alignment without merging reads, even if it is not the one I use for further analysis?

Thank you very much for your help!

— Reply to this email directly, view it on GitHub https://github.com/pontussk/PMDtools/issues/12#issuecomment-1502282555, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABO76I3CROZXJQEDX64X2ALXARUYNANCNFSM6AAAAAAT7NDNGY . You are receiving this because you commented.Message ID: @.***>

arenvale commented 1 year ago

Hi! Yes, the purpose is to validate the presence of aDNA in the library. Thank you!