rhdolin closed 5 years ago
PharmCAT will not be supporting this.
We want people to be 100% clear on how PharmCAT works and what happens with the data you provide to it. It does not accept arbitrary VCF for many reasons (see VCF Requirements for the full list of requirements), but the main one is that we will not make any assumptions about the input you provide. We have already encountered users making assumptions about how PharmCAT works or should work, which has led to confusion down the line.
For one thing, we do not know what "wild type" is because it can vary based on your reference sequence. Did you convert it from GRCh37 to GRCh38? If so, the "wild types" could differ between the two, and your VCF would not provide any indication that this is the case. Secondly, a missing entry can mean that the reference base was detected OR it can mean the base was not assayed or has no call. We cannot distinguish between uncalled positions and reference in a VCF file. So we ask that you declare each required position so that the input to PharmCAT is unambiguous.
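To make the ambiguity concrete, here is a minimal Python sketch that checks a single-sample VCF against a list of required positions and reports which are absent and which are present but uncalled. It is not part of PharmCAT: the function name `classify_positions` and the example coordinates are hypothetical, and the real list of required positions has to come from the PharmCAT documentation. The script only reports; it does not guess what a missing position means.

```python
from typing import Dict, Iterable, List, Tuple

Position = Tuple[str, int]  # (CHROM, POS)


def classify_positions(
    vcf_lines: Iterable[str], required: List[Position]
) -> Tuple[List[Position], List[Position]]:
    """Return (missing, no_call) for the required positions.

    missing: required positions with no record in the VCF at all.
    no_call: required positions whose genotype contains '.', e.g. './.'.
    Assumes a single-sample VCF (sample data in the 10th column).
    """
    seen: Dict[Position, str] = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns, so no genotype to read
        chrom, pos = fields[0], int(fields[1])
        fmt = fields[8].split(":")
        sample = fields[9].split(":")
        genotype = sample[fmt.index("GT")] if "GT" in fmt else "."
        seen[(chrom, pos)] = genotype

    missing = [p for p in required if p not in seen]
    no_call = [p for p in required if p in seen and "." in seen[p]]
    return missing, no_call


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
        "chr1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1",
        "chr1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t./.",
    ]
    required = [("chr1", 100), ("chr1", 200), ("chr2", 300)]
    missing, no_call = classify_positions(example, required)
    print("missing:", missing)
    print("no call:", no_call)
```

A position in the `missing` list could be homozygous reference or simply never assayed; nothing in the file says which, and that is the case described above.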
You have to decide how accurate you want the data you provide to PharmCAT to be, especially if you're making any clinical decisions based on PharmCAT's results. If you wish to make assumptions about your data, you are welcome to do so. Instructions on how to do this can be found here.
Hi Mark,
Thanks for your response. I understand your reasoning.
That said, I still feel that it can be tricky to implement PharmCAT, and that there might be some room for Postel's Law (https://en.wikipedia.org/wiki/Robustness_principle) here...
Consider this:
Are there other options that can make it easier for us to experiment with PharmCAT, while minimizing the risk of inappropriate use?
it can be tricky to implement PharmCAT...
I think that's an understatement. :) PharmCAT comes with a whole host of caveats, both in using it and understanding the reports it produces.
PharmCAT should be one part of your PGx pipeline. It is not meant to be a plug-and-play tool.
"PharmCAT will not make assumptions about your input" is a hard rule and will not change.
That's not to say you don't have a point. We have talked about adding other tools to flesh out the pipeline and make it easy to massage VCF files to be PharmCAT-ready. These would be completely separate tools and not flags on PharmCAT.
Unfortunately we do not have the bandwidth to do so at this time. If you would be willing to contribute code to do this, we would be more than happy to either include it or link to it.
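As a rough illustration of what such a separate preprocessing step might look like, the Python sketch below appends explicit 0/0 records for required positions that are absent from a single-sample VCF. This is not an existing PharmCAT tool. The function name `fill_missing_as_reference` and the `required_refs` mapping (position to reference base) are hypothetical, and the caller must supply reference bases that match the genome build of the VCF. Running it means deliberately making the "missing means reference" assumption discussed in this thread, so it returns the list of filled positions for the caller to log and review.

```python
from typing import Dict, Iterable, List, Tuple

Position = Tuple[str, int]  # (CHROM, POS)


def fill_missing_as_reference(
    vcf_lines: Iterable[str], required_refs: Dict[Position, str]
) -> Tuple[List[str], List[Position]]:
    """Append a homozygous-reference record for each absent required position.

    Returns (output_lines, filled_positions). Existing records are never
    modified. Assumes a single-sample VCF whose records use a GT field.
    """
    out: List[str] = []
    seen = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        out.append(line)
        if line and not line.startswith("#"):
            chrom, pos = line.split("\t")[:2]
            seen.add((chrom, int(pos)))

    filled: List[Position] = []
    for (chrom, pos), ref in sorted(required_refs.items()):
        if (chrom, pos) in seen:
            continue
        # ASSUMPTION: an absent position is treated as homozygous reference.
        # ALT, QUAL, FILTER and INFO are left as '.' (missing).
        out.append("\t".join([chrom, str(pos), ".", ref, ".", ".", ".", ".", "GT", "0/0"]))
        filled.append((chrom, pos))
    return out, filled
```

The added records are appended at the end, so the output is not coordinate-sorted and would need sorting before use. Whether the resulting file satisfies PharmCAT's VCF requirements is not checked here.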
Do you want to request a feature or report a bug? Request a feature
What is the current behavior? https://github.com/PharmGKB/PharmCAT/wiki/Preparing-VCF-Files#all-positions-needed-even-if-00-or-
If the current behavior is a bug, please provide the steps to reproduce and, if possible, your example input data via a Gist or similar.
What is the expected behavior? Can you please add a parameter to treat missing positions as wild type?
What is the motivation / use case for changing the behavior? Most of the VCFs I work with do not contain wild type calls. It's onerous to have to modify every VCF file before testing with PharmCAT.
Please tell us about your environment:
Other information (e.g. detailed explanation, stacktraces, related issues, suggestions how to fix, links for us to have context, eg. stackoverflow, gitter, etc) Thank you!!