petervangalen / MAESTER-2021

Scripts to reproduce the analysis of the MAESTER paper (https://www.nature.com/articles/s41587-022-01210-8).
MIT License

No coverage along the mitochondrial chromosome with published MAESTER data #6

Open smzt opened 1 year ago

smzt commented 1 year ago

Hi, I'm not sure whether my problem comes from running maegatk or from another piece of software in the MAESTER pipeline, so I am posting the issue here as well. It is already posted in the maegatk repository: https://github.com/caleblareau/maegatk/issues/12

I followed the steps for analyzing MAESTER data described at https://github.com/petervangalen/MAESTER-2021. After merging the full BAM file of the scRNASeq dataset (reads for all chromosomes) with the BAM file of the MAESTER dataset (reads for chrM only), I ran maegatk as follows:

maegatk bcall -b HQ_CBs.csv -c 20 -o NUEVO_maegatk -mr 3 -i GSM5534703_K562-BT142.bam -n sGSM5534703_K562-BT142 -z -ub UB -bt CB -so

Here is the resulting plot after running the code in MT_coverage.R (image: maester).

I saw similar peaks in the supplementary information of the MAESTER paper (Figure 7a, https://static-content.springer.com/esm/art%3A10.1038%2Fs41587-022-01210-8/MediaObjects/41587_2022_1210_MOESM1_ESM.pdf), where they are shown as belonging to the scRNASeq data. I must therefore be missing something in the processing of the MAESTER data, but so far I have not found the problem.

Here is what my BAM files look like before merging them to run maegatk.

scRNASeq BAM (SRA identifier SRR15598773):

SRR15598773.lite.1.127471761 0 chr1 10019 1 91M * 0 0 TAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCAAACCCA ??????????????????????????????????????????????????????????????????????????????????????????? NH:i:4 HI:i:1 AS:i:87 nM:i:1 RG:Z:scRNASeq:0:1:unknow_flowcell:0 RE:A:I xf:i:0 CR:Z:ATCTTCATCCATCAGA CY:Z:???????????????? CB:Z:ATCTTCATCCATCAGA-1 UR:Z:TTTCTCTTAGTG UY:Z:???????????? UB:Z:TTTCTCTTAGTG

MAESTER BAM (SRA identifier SRR15598774):

SRR15598774_6928992 16 chrM 1 255 59S181M * 0 0 CTGACGGGCCATCACGCCCACACCGCCCCCACGTTCCCCTGAAATCAGACCTCCCGAGGGATCACAGGTCTATCACCCTATTAACCCCTCACGGGAGCTCTCCATGCATGTGGTATTTTCGTCTGGGGGGTGTGCACGCGATAGCATTGCGAGACGCTGGAGCCGGAGCACCCTATGTCGCAGTATCTGTCTTTGATTCCTGCCTCATCCTATTATTTATCGCACCTACGTTCAATATTA ,,,,,F:FF:,FF,,,F:,F,:,,F:::,F,FFF,,::FF::,,,,,F,F,FF,FFF,,F,FF:F,F,,:,FFF:FFF,,F,,,FF,FF:,FF:F:FF::FFF,,FF:F,FFF,F,FFFFFFFFFFFFFF:FFF:,FFFFF,:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:,FFFFFFFFFFFFFFFFFF,FF,FFF:FFFFFFFFFFF,F,:,F,FFFF:FFFFFF,FFF:FFF:F NH:i:1 HI:i:1 AS:i:173 nM:i:3 CB:Z:TGGAGGATCTTGTTAC-1 UB:Z:CACTTATTGTTA
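The maegatk call above reads the cell barcode from the CB tag (-bt CB) and the UMI from the UB tag (-ub UB), so one quick sanity check is that records from both BAM files carry those tags in the same format. The following is only a minimal sketch of such a check, not part of maegatk or the MAESTER scripts. The helper names `parse_sam_tags` and `missing_tags` are hypothetical, and the two records are the ones shown above with SEQ and QUAL abbreviated to `*` for brevity.

```python
def parse_sam_tags(record: str) -> dict:
    """Return the optional tags of one SAM record as {tag: value}."""
    # Records pasted into the issue are space-separated; real SAM files use tabs.
    # Splitting on any whitespace handles both (assumes no spaces inside values).
    fields = record.split()
    tags = {}
    for field in fields[11:]:  # the first 11 columns are the mandatory SAM fields
        tag, _type, value = field.split(":", 2)  # value itself may contain ':'
        tags[tag] = value
    return tags


def missing_tags(record: str, required=("CB", "UB")) -> list:
    """List the required tags that are absent from a record."""
    tags = parse_sam_tags(record)
    return [tag for tag in required if tag not in tags]


# Records from the issue, with SEQ and QUAL abbreviated to "*".
scrna_record = (
    "SRR15598773.lite.1.127471761 0 chr1 10019 1 91M * 0 0 * * "
    "NH:i:4 HI:i:1 AS:i:87 nM:i:1 RG:Z:scRNASeq:0:1:unknow_flowcell:0 "
    "RE:A:I xf:i:0 CR:Z:ATCTTCATCCATCAGA CY:Z:???????????????? "
    "CB:Z:ATCTTCATCCATCAGA-1 UR:Z:TTTCTCTTAGTG UY:Z:???????????? "
    "UB:Z:TTTCTCTTAGTG"
)
maester_record = (
    "SRR15598774_6928992 16 chrM 1 255 59S181M * 0 0 * * "
    "NH:i:1 HI:i:1 AS:i:173 nM:i:3 CB:Z:TGGAGGATCTTGTTAC-1 UB:Z:CACTTATTGTTA"
)

for name, record in [("scRNASeq", scrna_record), ("MAESTER", maester_record)]:
    tags = parse_sam_tags(record)
    print(name, "CB =", tags.get("CB"), "UB =", tags.get("UB"),
          "missing:", missing_tags(record))
```

Both example records carry CB and UB, and both barcodes end in `-1`. Whether the barcodes listed in HQ_CBs.csv use the same suffix is something this sketch does not check.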

I also opened the merged BAM file (scRNASeq plus MAESTER) to check the coverage along chrM, and chrM is very well covered. Here is the IGV image (image: scRNASeq_chrM_reads).
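For reference, "coverage along chrM" here means the number of reads whose aligned bases overlap each position. A minimal sketch of how that depth follows from a record's POS and CIGAR fields is given below. It is an illustration only and not how IGV, maegatk or MT_coverage.R compute coverage. The function name `add_coverage` is hypothetical, and counting only M, = and X operations as covering bases is an assumption (tools differ, for example, on whether deletions count).

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference
COVERING = set("M=X")         # assumption: only aligned bases count as coverage


def add_coverage(depth: Counter, pos: int, cigar: str) -> Counter:
    """Add one alignment (1-based POS plus CIGAR string) to a per-position depth counter."""
    ref_pos = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in COVERING:
            for p in range(ref_pos, ref_pos + length):
                depth[p] += 1
        if op in REF_CONSUMING:
            ref_pos += length
    return depth


# The MAESTER record shown above starts at chrM position 1 with CIGAR 59S181M:
# 59 soft-clipped bases (not aligned), then 181 aligned bases.
depth = Counter()
add_coverage(depth, 1, "59S181M")
print(min(depth), max(depth))  # first and last covered position
```

With this convention, the soft-clipped 59 bases of that read add no coverage, and the read contributes depth 1 to chrM positions 1 through 181.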

I have also attached the contents of the final directory: final.zip

Any help would be much appreciated.

Regards,

Sheila