caleblareau / mgatk

mgatk: mitochondrial genome analysis toolkit
http://caleblareau.github.io/mgatk
MIT License

mgatk-del-find fails in visualisation step #77

Closed grasshoffm closed 5 months ago

grasshoffm commented 12 months ago

Describe the bug
I ran mgatk-del-find on a set of BAM files. I removed any reads that were not on the mitochondrial genome. mgatk-del-find progresses to the visualisation part and then fails with the error message below. It also only produced the .clip.tsv and .SA.tsv files.

The error message:

Error in filter():
ℹ In argument: pos < 16569 & pos > 1.
Caused by error in pos < 16569:
! comparison (<) is possible only for atomic and list types
Backtrace:
  1. ├─... %>% unique()
  2. ├─base::unique(.)
  3. ├─dplyr::filter(., pos < 16569 & pos > 1)
  4. ├─dplyr:::filter.data.frame(., pos < 16569 & pos > 1)
  5. │ └─dplyr:::filter_rows(.data, dots, by)
  6. │   └─dplyr:::filter_eval(...)
  7. │     ├─base::withCallingHandlers(...)
  8. │     └─mask$eval_all_filter(dots, env_filter)
  9. │       └─dplyr (local) eval()
 10. └─base::.handleSimpleError(...)
 11.   └─dplyr (local) h(simpleError(msg, call))
 12.     └─rlang::abort(message, class = error_class, parent = parent, call = error_call)
Execution halted

To Reproduce
I cannot provide the data necessary for reproduction, but I used the following code.

source mgatk/venv3/bin/activate
mgatk-del-find \
  --input ${bams_use} \
  --mito-chromosome "MT" \
  --output ${OUTPUT}/${sample_use}/

Expected behavior
I expected the function to produce the output files as in the vignette.

caleblareau commented 11 months ago

Thanks for the note. My guess is that you are missing an R package dependency.

Does it work if you clone the repository and run the command on the test data?

mgatk-del-find -i pearsonbam/CACCACTAGGAGGCGA-1.qc.bam

grasshoffm commented 11 months ago

Yes, with the test data it works, but the results are different from the vignette.

The number of clipped reads for 13095 is 62 and for 6073 it is 54. The read depth is also much lower.

Should I attach the results here?

caleblareau commented 11 months ago

The vignette summarizes over a full experiment, whereas the test data is just a single cell, so the differences in the abundance of the clipped reads are to be expected.

Given that the test data works OK, I don't think there is a procedural issue in the code; the problem probably has more to do with the experiment that you originally ran the software on.

Could you either attach the output files from the run or email them to me (clareau@stanford.edu)? My guess is that there are limited, if any, spliced reads that were detected, and that is what's causing the error in your original execution.

grasshoffm commented 11 months ago

Sorry for the confusion, but I did not mean that the results for my sample are different. I meant that I get different results for the test data. I will email you the output files.

I do not think that I can share the results for my data, but the software does not find any secondary alignments for any positions. The positions do have coverage and clip_count, though.
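
One way to check the "no secondary alignments" observation directly on an input BAM is to count the records that carry an SA:Z: tag, the standard SAM optional field that aligners use for supplementary (chimeric) alignments. The sketch below is only a rough diagnostic. It assumes that samtools is installed and that mgatk-del-find depends on this tag, which the .SA.tsv output name suggests but the thread does not confirm. The function name count_sa_reads and the path my_sample.bam are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count SAM records on stdin that carry an SA:Z: tag.
# Header lines (starting with '@') are skipped; optional fields start at column 12.
count_sa_reads() {
  awk -F'\t' '
    !/^@/ {
      for (i = 12; i <= NF; i++) {
        if ($i ~ /^SA:Z:/) { n++; break }
      }
    }
    END { print n + 0 }
  '
}

# Usage (assumes samtools is on PATH; my_sample.bam is a placeholder path):
#   samtools view my_sample.bam | count_sa_reads
```

A count of 0 would be consistent with the guess above that the run had nothing to plot. It would not prove that this is what triggers the filter() error.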