Hi @smzt my apologies for the delay-- the issue is fortunately easy to fix. I should have noted in the README that you need to specify the reference genome for this:
mgatk call -i humanbam -o out -n glio -g hg19
The mgatk default is rCRS, which is 16569 base pairs; the glio data was aligned to hg18 (16571). If you run into issues where there are no reads for genotyping, it's typically a reference genome issue (either the length, or using MT vs chrM for the mitochondrial chromosome name convention).
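To see which of the two causes applies, you can look at the `@SQ` lines of the BAM header (for example, the text printed by `samtools view -H file.bam`). The sketch below is only an illustration, not part of mgatk: the function names, the set of mitochondrial contig names, and the length-to-genome table are assumptions, with the two lengths taken from the reply above (16569 for the rCRS default, 16571 for the data that needed `-g hg19`).

```python
# Hypothetical helper, not part of mgatk: inspect SAM/BAM header text
# and report the mitochondrial contig name and length.

# Lengths quoted in the reply above; any other length is reported as unknown.
KNOWN_LENGTHS = {
    16569: "rCRS (mgatk default)",
    16571: "hg19 (pass -g hg19)",
}

# Assumption: common naming conventions for the mitochondrial contig.
MITO_NAMES = {"chrM", "MT", "M", "chrMT"}


def mito_contigs(header_text):
    """Return (name, length) for each @SQ line whose name looks mitochondrial."""
    found = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # @SQ fields are tab-separated TAG:VALUE pairs, e.g. SN:chrM and LN:16571.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        name, length = fields.get("SN"), fields.get("LN")
        if name in MITO_NAMES and length is not None:
            found.append((name, int(length)))
    return found


def describe(header_text):
    """Turn each mitochondrial contig into a short human-readable hint."""
    return [
        f"{name}: {length} bp -> {KNOWN_LENGTHS.get(length, 'unknown reference, check -g')}"
        for name, length in mito_contigs(header_text)
    ]


# Made-up header text for illustration only.
example = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:249250621\n"
    "@SQ\tSN:chrM\tLN:16571\n"
)
print(describe(example))
```

If the list comes back empty, the mitochondrial contig is probably named something outside the assumed set; if the length is not in the table, the reference genome passed with `-g` is the first thing to check.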
Hello,

Running mgatk from the venv3 environment, from the tests directory and following the instructions in the README.md, I executed:

$ mgatk call -i humanbam -o out -n glio
Tue Jul 04 13:08:24 CEST 2023: mgatk v0.6.8
Tue Jul 04 13:08:24 CEST 2023: NOTE: the samples below either have 0 mtDNA reads at the specified chromosome or are mapped to an incorrectly specified reference mitochondrial genome
Tue Jul 04 13:08:24 CEST 2023: Will remove samples from processing:
REMOVED: MGH97-P8-H02.mito
REMOVED: MGH60-P6-B01.mito
REMOVED: MGH97-P8-H03.mito
REMOVED: MGH60-P6-A11.mito
ERROR: Could not import any samples from the user specification.
ERROR: check flags, logs, and input configuration (including reference mitochondrial genome); QUITTING
I slightly modified the command to avoid the above errors and gzip errors:

Select jobs to execute...
gzip: out/final/glio.A.txt: No such file or directory
gzip: out/final/glio.C.txt: No such file or directory
gzip: out/final/glio.G.txt: No such file or directory
gzip: out/final/glio.T.txt: No such file or directory
gzip: out/final/glio.coverage.txt: No such file or directory

So the command looks like this:

$ nohup mgatk call -i humanbam -g hg19 -o out -n glio -z -so &> glio.log &
The -so option avoids the gzip errors, but the pipeline still does not produce the output files described in the mgatk Wiki when run on the test data.
Here is the log file glio.log
I also ran other lines in the README.md file to test mgatk, but the tool also failed:

$ nohup mgatk bcall -i barcode/test_barcode.bam -n bc1 -o bc1d -bt CB -b barcode/test_barcodes.txt -z -so &> bc1.log &
Here is the log file for this command bc1.log
These are the contents of the final directory under bc1d:

chrM_refAllele.txt
bc1.T.txt.gz
bc1.G.txt.gz
bc1.C.txt.gz
bc1.coverage.txt.gz
bc1.A.txt.gz
bc1.depthTable.txt
bc1.rds
bc1.signac.rds
The files *.variant_stats.tsv.gz, *.cell_heteroplasmic_df.tsv.gz and *.vmr_strand_plot.png are always missing from the final folder. I assumed these files would also be written to the final output directory, or to another directory under the output folder I specified in the command, but I might be wrong.
I also opened an issue in maegatk https://github.com/caleblareau/maegatk/issues/11 because I thought the problem was with that tool, but I see that the reported error is exactly the same as with mgatk.
Any help would be much appreciated.
Best regards,
Sheila