Closed TCLamnidis closed 3 months ago
nf-core lint
overall result: Passed :white_check_mark: :warning:
Posted for pipeline commit d39a10a
+| ✅ 246 tests passed |+
#| ❔ 1 tests were ignored |#
!| ❗ 22 tests had warnings |!
TODOs:
--exons
tsv file. Started working on it locally)
Now adding pileupcaller. ANGSD calling still lacks a module.
Waiting on https://github.com/nf-core/test-datasets/pull/1066. Once that is merged, I can run manual tests for pileupcaller using multi-reference too (to check for .combine duplications). From limited multiref tests, it seems that mpileup somehow gets confused, giving:
Command error:
[mpileup] 3 samples in 4 input files
samtools mpileup: error reading from input file
Will need to look into the specific BAM files used as input and see what's what.
The error in the previous comment was caused by a lack of library merging, meaning multiple BAMs per sample were supplied. It is not an issue with genotyping, but with a missing previous step.
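A quick way to spot this situation before running mpileup is to compare the sample names in the read groups of all input BAMs. The following is only a minimal sketch in Python, not what the pipeline does: it assumes the headers have already been dumped to text (for example with `samtools view -H`), and the file names, sample names and function names are all hypothetical.

```python
from collections import defaultdict


def read_group_samples(header_text):
    """Return the set of SM values found on the @RG lines of a SAM header."""
    samples = set()
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                samples.add(field[3:])
    return samples


def samples_needing_merge(headers_by_bam):
    """Map each sample that appears in more than one BAM to those BAMs."""
    grouped = defaultdict(list)
    for bam, header_text in headers_by_bam.items():
        for sample in read_group_samples(header_text):
            grouped[sample].append(bam)
    return {s: sorted(b) for s, b in grouped.items() if len(b) > 1}


# Hypothetical headers: 4 input files but only 3 samples.
example_headers = {
    "lib1.bam": "@HD\tVN:1.6\n@RG\tID:lib1\tSM:sampleA\tLB:lib1",
    "lib2.bam": "@HD\tVN:1.6\n@RG\tID:lib2\tSM:sampleA\tLB:lib2",
    "lib3.bam": "@RG\tID:lib3\tSM:sampleB\tLB:lib3",
    "lib4.bam": "@RG\tID:lib4\tSM:sampleC\tLB:lib4",
}

print(samples_needing_merge(example_headers))
# {'sampleA': ['lib1.bam', 'lib2.bam']}
```

Any sample reported here would need its libraries merged into a single BAM before genotyping.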
Added manual tests. All passed. (Maybe check with ssDNA also for pileupcaller.) Outstanding TODOs:
Added merging of eigenstrat genotype datasets per reference across strandedness.
For some reason, the BAM input in -profile test_multiref causes samtools mpileup to create no output and instead throw an error. The BAM passes samtools quickcheck just fine, so I don't think it is inherently broken. More likely, the shortened reference not matching the header of the BAM is the issue.
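One way to test that suspicion would be to compare the @SQ lines of the BAM header against the reference's .fai index. The sketch below is in Python and is not part of the pipeline; it assumes both files are available as text, and the contig name and lengths in the example are made up.

```python
def header_contigs(header_text):
    """Return {contig name: length} from the @SQ lines of a SAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in fields and "LN" in fields:
            contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def fai_contigs(fai_text):
    """Return {contig name: length} from the first two columns of a .fai."""
    contigs = {}
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        cols = line.split("\t")
        contigs[cols[0]] = int(cols[1])
    return contigs


def contig_mismatches(header_text, fai_text):
    """List contigs in the BAM header that are absent from, or differ in length to, the reference."""
    ref = fai_contigs(fai_text)
    problems = []
    for name, length in header_contigs(header_text).items():
        if name not in ref:
            problems.append(f"{name}: in BAM header but not in reference")
        elif ref[name] != length:
            problems.append(
                f"{name}: BAM header length {length}, reference length {ref[name]}"
            )
    return problems


# Hypothetical values: a BAM mapped to a full contig, checked against a shortened reference.
bam_header = "@HD\tVN:1.6\n@SQ\tSN:chrM\tLN:16000"
reference_fai = "chrM\t5000\t6\t60\t61"

print(contig_mismatches(bam_header, reference_fai))
# ['chrM: BAM header length 16000, reference length 5000']
```

An empty list would suggest the header and reference agree, in which case the cause lies elsewhere.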
Added genotypers to the test commands, and added a pipeline error for when pileupcaller is requested but no snp/bed file is provided.
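The shape of that check is roughly as follows. This is a Python sketch of the logic only, not the actual Nextflow code in this PR. The parameter names are placeholders, and requiring both the BED and the SNP file is an assumption made for the example.

```python
# Placeholder parameter names, used for illustration only.
REQUIRED_FOR_PILEUPCALLER = (
    "genotyping_pileupcaller_bedfile",
    "genotyping_pileupcaller_snpfile",
)


def check_pileupcaller_inputs(params):
    """Raise ValueError if pileupcaller is requested without its input files."""
    if params.get("genotyping_tool") != "pileupcaller":
        return  # nothing to check for other genotypers
    missing = [key for key in REQUIRED_FOR_PILEUPCALLER if not params.get(key)]
    if missing:
        raise ValueError(
            "pileupcaller requested but missing: " + ", ".join(missing)
        )


# Example: pileupcaller requested with a BED file but no SNP file.
try:
    check_pileupcaller_inputs(
        {
            "genotyping_tool": "pileupcaller",
            "genotyping_pileupcaller_bedfile": "positions.bed",
        }
    )
except ValueError as err:
    print(err)
# pileupcaller requested but missing: genotyping_pileupcaller_snpfile
```

Failing early like this avoids running the mapping steps only to stop at genotyping.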
Locally I get some weird errors about failing to index the input BAM from the test profile. Never seen that before and don't think I changed anything that would cause that error, so wondering if CI will reproduce that behaviour.
EDIT: It seems that resuming the run makes the samtools index process work fine. visible confusion
It seems the input BAM with Mammoth mtDNA reads has outdated RG tag info (bad SM, no LB, wrong ID), as it is a very old eager output. I updated it and will try again once test-datasets is fixed.
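For reference, a read-group sanity check along these lines could catch such headers up front. This is a hedged Python sketch, not part of the pipeline: it treats ID, SM and LB as the fields of interest because those are the ones named above, and the header and expected sample name in the example are invented.

```python
REQUIRED_RG_FIELDS = ("ID", "SM", "LB")


def rg_problems(header_text, expected_sample=None):
    """Describe missing or unexpected read-group fields in a SAM header."""
    rg_lines = [l for l in header_text.splitlines() if l.startswith("@RG")]
    if not rg_lines:
        return ["no @RG line in header"]
    problems = []
    for line in rg_lines:
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        rg_id = fields.get("ID", "<no ID>")
        for key in REQUIRED_RG_FIELDS:
            if key not in fields:
                problems.append(f"{rg_id}: missing {key}")
        # Only compare SM when an expected name is given and SM is present.
        if expected_sample is not None and fields.get("SM") not in (None, expected_sample):
            problems.append(
                f"{rg_id}: SM is {fields['SM']!r}, expected {expected_sample!r}"
            )
    return problems


# Hypothetical old-style header: no LB, and an SM that differs from the sample sheet.
old_header = "@HD\tVN:1.6\n@RG\tID:old\tSM:wrong"

print(rg_problems(old_header, expected_sample="mammoth"))
# ["old: missing LB", "old: SM is 'wrong', expected 'mammoth'"]
```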
TODOs:
reference_ploidy option that goes directly in the reference meta. Will need updating in the reference sheet in test-datasets too (https://github.com/nf-core/test-datasets/pull/1109).
GATK_UG module needs updating (https://github.com/nf-core/modules/pull/5017 ✅), and that needs updating of the mulled container (https://github.com/BioContainers/multi-package-containers/pull/2992 ✅).
TODOs:
This needs re-review. I did not address the FreeBayes BED file issue. We will not implement it now (as it would be a new feature anyway), and will only implement it if it is requested.
GATK4_HAPLOTYPECALLER now fails because the input BAM has a different sample name in its RG than the one produced by the MAP SWF.
Updating test-datasets to fix this. https://github.com/nf-core/test-datasets/pull/1141 ✅
TODOs:
gatk_dbsnp (will need tweaking once dbsnp gets a meta)
genotyping_gatk_dbsnp needs to fit a regex pattern (*.vcf, no gz)
Progress:
PR checklist
scrape_software_versions.py
(nf-core lint .).
(nextflow run . -profile test,docker).
docs/usage.md is updated.
docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).