quinlan-lab / pathoscore

pathoscore evaluates variant pathogenicity tools and scores.
MIT License

inframe deletions, insertions, and splice vars added, and fixed bug for multiple CSQs #37

Closed jimhavrilla closed 6 years ago

jimhavrilla commented 7 years ago

These variations should be included as per examples like this one: https://www.nature.com/ng/journal/v45/n8/full/ng.2670.html

brentp commented 7 years ago

while you're looking at this, can you go through this list: https://github.com/samtools/bcftools/blob/develop/csq.c#L224 and see what else should be added?

jimhavrilla commented 7 years ago

Yeah, so beyond those two, the only other two we use in the CCR paper are splice acceptor and donor variants, and only if they are also labeled as a coding sequence variant. It depends on whether you want those in pathoscore or not.

Jim Havrilla PhD Candidate in Human Genetics, University of Utah Accelerated BS/MS in Biomedical Engineering, Drexel University '12, Concentration: Bioinformatics "Memory, comprehension, communication, motivation"


jimhavrilla commented 7 years ago

Also, I think you're not handling variants with a BCSQ like "inframe_deletion&splice_donor&start_lost" at all. The call to isfunctional checks whether the consequence is in the list ['stop_gained', 'stop_lost', 'start_lost', 'initiator_codon', 'rare_amino_acid', 'missense', 'protein_altering', 'frameshift'], which such a BCSQ would not be. So those are other variants we cover in the CCR code.
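To illustrate the point, here is a minimal sketch rather than the actual pathoscore code. The consequence list is the one quoted above; the helper names whole_string_match and any_part_match are hypothetical, and splitting on "&" is just one possible way to handle compound consequences.

```python
# Consequence list as quoted in the comment above.
FUNCTIONAL = {
    'stop_gained', 'stop_lost', 'start_lost', 'initiator_codon',
    'rare_amino_acid', 'missense', 'protein_altering', 'frameshift',
}


def whole_string_match(csq):
    """Hypothetical helper: test the full consequence string against the list."""
    return csq in FUNCTIONAL


def any_part_match(csq):
    """Hypothetical helper: split a compound consequence on '&' and
    accept it if any single part is in the list."""
    return any(part in FUNCTIONAL for part in csq.split("&"))


csq = "inframe_deletion&splice_donor&start_lost"
print(whole_string_match(csq))  # False: the compound string is not in the list
print(any_part_match(csq))      # True: 'start_lost' is one of the parts
```

A plain membership test misses the compound string even though it contains start_lost, which is the gap described here.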


jimhavrilla commented 6 years ago

This works better because it deals with the “&” cases where a splice variant is also labeled as a coding sequence variant. It works exactly as intended as is, and it is the way we use it in the CCR definition of functional variants. The code as you wrote it would only work if “coding_sequence” is there.

On Tue, Dec 5, 2017 at 5:47 PM Brent Pedersen notifications@github.com wrote:

@brentp requested changes on this pull request.

can you simplify this? e.g. leave the original if statement as it was (or add another effect to it, I can't tell).

and then add another statement after it:

if "coding_sequence" in eff and any(x in eff for x in (...)): return True

and then add a comment explaining what that does and why it happens.
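A rough sketch of what that suggested structure could look like, under stated assumptions: the first if statement is a stand-in for the original check (using the consequence list quoted earlier in the thread), and the SPLICE tuple is a guess at what the elided (...) would contain, based on the splice acceptor and donor variants mentioned earlier. The isfunctional body shown here is illustrative, not the code in the pull request.

```python
# Stand-in for the original check; list as quoted earlier in the thread.
FUNCTIONAL = (
    'stop_gained', 'stop_lost', 'start_lost', 'initiator_codon',
    'rare_amino_acid', 'missense', 'protein_altering', 'frameshift',
)

# Assumption: the effects that would fill the elided "(...)" in the review.
SPLICE = ('splice_acceptor', 'splice_donor')


def isfunctional(eff):
    # Original-style check, left as it was: exact membership in the list.
    if eff in FUNCTIONAL:
        return True
    # Splice acceptor/donor variants count as functional only when the
    # same '&'-joined consequence string is also labeled as coding sequence.
    if "coding_sequence" in eff and any(x in eff for x in SPLICE):
        return True
    return False


print(isfunctional("splice_donor&coding_sequence"))              # True
print(isfunctional("splice_donor&intron"))                       # False
print(isfunctional("inframe_deletion&splice_donor&start_lost"))  # False
```

The last call shows the limitation raised in the reply above: in this form the second statement only fires when "coding_sequence" is present in the string.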
