VinceLiAB closed 1 year ago
Sorry to hear that @VinceLiAB. If you run 'usher --version', what is the output?
The usher version was definitely the culprit. I updated to v0.6.1 from v0.6.0 and it is working again. Thanks for the quick response!
Great, glad it's working for you now!
@AngieHinrichs v0.6.1 still seems to have a problem with small test input.
Simply running pangolin pangolin/data/reference.fasta
causes usher-sampled to hang.
This is why the tests for the bioconda recipe update are failing.
Oof, thanks @wm75! I tested with tests/test-data/sequence1.fasta which has a single sequence... but I did not test with reference.fasta! -- which leads to a VCF file with no data lines (no mutations), which might be triggering some corner case in usher-sampled. @yceh can you please take a look? Here is the header-only VCF file that is causing usher-sampled to hang:
##fileformat=VCFv4.2
##reference=/data/tmp/tmp1tcryjgq/sequences.withref.fa:outgroup_A
##source=faToVcf /data/tmp/tmp1tcryjgq/sequences.withref.fa /data/tmp/tmp1tcryjgq/sequences.aln.vcf
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 5a7f5aa9677f248abcb2bedf90d7f3e2
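One way to spot this corner case before the file reaches usher-sampled is to count the VCF's data lines. The following is a minimal Python sketch, not code from pangolin or usher; the function name `count_vcf_data_lines` and the sample column name are made up for illustration.

```python
import io


def count_vcf_data_lines(handle):
    """Count lines in a VCF text stream that are neither blank nor '#' header lines."""
    return sum(1 for line in handle if line.strip() and not line.startswith("#"))


# A header-only VCF like the one above (sample name is a placeholder).
header_only = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n"
)

if count_vcf_data_lines(header_only) == 0:
    print("VCF has no data lines (no mutations relative to the reference)")
```

A caller could use a check like this to skip placement or to log a warning when the input is identical to the reference.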
Ah, that makes a lot of sense, thanks! Unfortunately, the bioconda test cannot use the test-data sequence because that's not getting installed. We could introduce a SNP at runtime though for now if that's all that's needed to make the test work, then revert the patch when you have an usher fix.
> We could introduce a SNP at runtime though for now if that's all that's needed to make the test work, then revert the patch when you have an usher fix.
Yes, it is sufficient to change just a single base in the sequence before using it as a test input!
Ah, nice idea with the patch! Something like this should work:
sed -e 's/ACATGGTTTAGTCAGCGTGG/ACATGGTTTAGCCAGCGTGG/' pangolin/data/reference.fasta > "$tempFasta"
[Edit: NM I see you found your own 😁]
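For anyone who prefers not to depend on a specific motif sitting on a single line (which the sed approach requires), the same idea can be sketched in Python. This is only an illustration under stated assumptions: the input holds a single FASTA record, and the helper name `introduce_snp` and the example sequence are hypothetical.

```python
def introduce_snp(fasta_text, index, new_base):
    """Return single-record FASTA text with the base at 0-based `index` replaced.

    Assumes exactly one record; sequence lines are joined, so the output
    sequence is written on a single line.
    """
    lines = fasta_text.strip().splitlines()
    header, seq = lines[0], "".join(lines[1:])
    if not header.startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    if seq[index].upper() == new_base.upper():
        raise ValueError("new base matches the existing base; no SNP introduced")
    return f"{header}\n{seq[:index]}{new_base}{seq[index + 1:]}\n"


# Toy input, not the real reference sequence.
print(introduce_snp(">ref\nACGT\nACGT\n", 4, "C"), end="")
```

The explicit check that the new base differs from the old one guards against accidentally producing another mutation-free input.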
@yceh and @yatisht have already fixed it and released usher v0.6.2: https://github.com/yatisht/usher/releases/tag/v0.6.2
Thanks @AngieHinrichs @yceh @yatisht! The bioconda packages for usher 0.6.2 and for pangolin 4.2 using 0.6.2 of usher are now available.
pangolin 4.2 will also appear on usegalaxy.eu later today, together with pangolin 4.1.3 pinned to the same core dependencies, i.e. both Galaxy tool versions will use:
This way comparisons between usher and usher-sampled should be relatively simple.
Great, thanks so much @wm75!
> comparisons between usher and usher-sampled
Just for the record, results should be very consistent overall but not identical, especially when sequences have Ns at lineage-defining positions. usher may place a sequence on the node that starts a lineage even if the sequence has only Ns at that node's defining mutations. usher-sampled does not match all-Ns on the node at the end of the path; it places the sequence on that node's parent, so in such cases usher-sampled assigns the sample the parental lineage. Also, usher would find some redundant equally parsimonious placements (EPPs) while usher-sampled is more stringent, so when multiple EPPs would lead to different assignments and pangolin takes a vote, the outcomes can differ. [Next on my list: get rid of the voting; with amplicon dropout issues it's looking like a bad idea now, see #492.]
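To make the voting effect concrete, here is a toy Python sketch of a plain majority vote over the lineages of a sample's placements. It is not pangolin's actual voting code; the function name and the placement lists are invented purely to show how dropping redundant placements can change the winner.

```python
from collections import Counter


def majority_lineage(placement_lineages):
    """Return the most frequent lineage label among a sample's placements."""
    return Counter(placement_lineages).most_common(1)[0][0]


# Hypothetical EPP sets for the same sample (made-up data).
with_redundant_epps = ["B.1", "B.1", "B.1.1", "B.1.1", "B.1.1"]
stricter_epps = ["B.1", "B.1", "B.1.1"]

print(majority_lineage(with_redundant_epps))
print(majority_lineage(stricter_epps))
```

In this made-up case the first list votes for B.1.1 and the second for B.1, even though both describe the same sample.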
Hi,
Pangolin v4.2 analysis seems to be stuck on the step "Using UShER as inference engine.". I have tried analyzing different sets of data ranging from 20 to 90 samples and they all stop at the same step.
No error messages are given and I didn't encounter this issue prior to the update.
Thank you.