Clinical-Genomics / scout

VCF visualization interface
https://clinical-genomics.github.io/scout
BSD 3-Clause "New" or "Revised" License

[Feature request] Export pinned variants or filtered variants as VCF (not excel) #1915

Closed hassanfa closed 4 years ago

hassanfa commented 4 years ago

Export filtered or pinned variants as VCF for cancer cases, so that they can be fed to other external software for annotation or reporting. It would be great to have some sort of export functionality as VCF. It should be a very simple VCF:

Ultimately this functionality can be extended across the institute level to report the AN and AC fields in INFO. But for now, a simple VCF without INFO and FORMAT can also work.

This is closely related to issue #1914, but since VCF is more complex than Excel, it needs its own issue.
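For illustration only (this is not the example from the original post, and not Scout code): a minimal sketch of what a sites-only VCF export could look like. The function name `variants_to_vcf`, the `source` header value, and the dict keys `chrom`, `pos`, `ref`, `alt` and `id` are all assumptions made up for this sketch. QUAL, FILTER and INFO are written as `.` (missing), and no FORMAT or sample columns are emitted.

```python
from datetime import date

# The eight fixed VCF columns; FORMAT and sample columns are left out on purpose.
VCF_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def variants_to_vcf(variants, source="scout-export-sketch"):
    """Render variant dicts as a sites-only VCF string.

    Each variant is a dict with the hypothetical keys
    'chrom', 'pos', 'ref', 'alt' and, optionally, 'id'.
    """
    lines = [
        "##fileformat=VCFv4.2",
        f"##fileDate={date.today():%Y%m%d}",
        f"##source={source}",
        "\t".join(VCF_COLUMNS),
    ]
    # Group records by chromosome (plain string order) and sort by
    # position within each chromosome.
    ordered = sorted(variants, key=lambda v: (str(v["chrom"]), int(v["pos"])))
    for v in ordered:
        record = [
            str(v["chrom"]),
            str(v["pos"]),
            v.get("id") or ".",  # "." marks a missing value in VCF
            v["ref"],
            v["alt"],
            ".",  # QUAL: missing
            ".",  # FILTER: missing (filters not applied)
            ".",  # INFO: empty for now; AC/AN could be added here later
        ]
        lines.append("\t".join(record))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up coordinates, purely for demonstration.
    pinned = [
        {"chrom": "2", "pos": 2000, "ref": "G", "alt": "T"},
        {"chrom": "1", "pos": 1000, "ref": "A", "alt": "C", "id": "example_1"},
    ]
    print(variants_to_vcf(pinned), end="")
```

Keeping the export to the fixed columns means the file stays valid VCF while carrying nothing beyond the selected sites.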

dnil commented 4 years ago

Sorry to be a git, but wouldn't it make sense to also deliver original VCFs if the customer intends to do other analysis, eg via caesar? This delivery method already exists.

hassanfa commented 4 years ago

That's a very good candidate. But sometimes they filter according to a panel, e.g. a panel of 6 genes. Then these variants can be downloaded and run through other software to create a molecular report.

I agree that it can create unnecessary strain on Scout servers, but ultimately only a handful of variants will be exported after "expert review of a case on Scout".

This is of course open to discussion.

dnil commented 4 years ago

Wouldn't we rather want to include those softs in the pipe beforehand, or what are we talking about?!

hassanfa commented 4 years ago

The said software is behind closed doors 😞

dnil commented 4 years ago

I would very much rather be working to openly implement what the black box does than create a redundant route into it.. 😜

northwestwitch commented 4 years ago

Ultimately this functionality can be extended across institute level to point out AN and AC field in INFO.

I'm not really sure I want to go this way either. It would be much easier to download the VCF and filter that, instead of rebuilding it. This is not the purpose of Scout.

hassanfa commented 4 years ago

I would very much rather be working to openly implement what the black box does than create a redundant route into it.. 😜

That would make two of us! 😛 But here we are.

@northwestwitch

Two things that brought this up:

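Archiving a final VCF for cases that go into a clinical trial is much easier than storing a PDF. The idea is that a VCF of the final variants can act as part of the digital documents of a patient/case, independent of Scout.

I'm actually strongly in favor of being able to generate a VCF alongside the report: one PDF report, and one VCF of the items in the report.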

vwirta commented 4 years ago

The exported VCF would be the outcome of a clinical interpretation process, ie only selected variants would be exported. The downstream tool is a third-party tool for molecular tumour boards where the purpose of the tool is to identify therapy and clinical trial recruitment options for the patient.

The tool is used outside the diagnostic labs, in this case by oncologists. It is essential that all variants are technically true, hence an interpretation / assessment step in Scout.

On 12 May 2020, at 16:20, Hassan Foroughi wrote:

@northwestwitch Two things that brought this up:

Archiving a final VCF for cases that go into clinical trial is much easier than storing PDF. The idea is that VCF of final variants can act as part of digital documents of a patient/case to be independent of Scout. I'm actually in strong favor of being able to generate a VCF alongside the report: One PDF report, and one VCF of the items in the report.


dnil commented 4 years ago

That would entail exporting a small selected subset to VCF - preferentially the same viewed and assessed variants already targeted for reporting to pdf. Perfectly fine as far as features go, and a good supplement to pdf for returning to LIMS or archive databases / quality registries!

I would take the overall workflow, with its intention to run the variants through another tool externally, through another few rounds of discussion. It seems inevitable that we would benefit more from including at least a subset of the functionality of that other tool directly, or, worst case, just allowing queries to said resource during variant triage.

dnil commented 4 years ago

Perhaps we lack good tools to do the comparisons? And should develop one - possibly in the "varg" context, possibly something new?

hassanfa commented 4 years ago

There are good ones out there: hap.py, som.py, rtg, etc. IMHO we should try to use those first.

dnil commented 4 years ago

Ok, let's close this for a bit again perhaps, and revisit it when a more concrete use case arrives? I imagine RNA scoring for DNA WGS cases may be an early such case, where one could imagine diffs and/or scoring on an aggregate level post genmod, or at least with an updated genmod.

dnil commented 4 years ago

Ahem, sorry, wrong issue.. This discussion was for https://github.com/Clinical-Genomics/scout/issues/2072 I guess?

hassanfa commented 4 years ago

haha, yes. too many tabs open. it was for #2072...

heronikdin commented 4 years ago

My bad, didn't know we already had one on here @northwestwitch. So exactly what @vwirta has written: we would need the VCF for the variants we trust. How is this process going? Would you need anything else from me or @hassanfa?

northwestwitch commented 4 years ago

So far we haven't started working on this, but perhaps one could add another option to the checkboxes on the variants page:

[image: screenshot of the checkboxes on the variants page]

So add another button to save the selected variants as "trusted" or something, then one could download them from the same page?
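A rough sketch of that idea, under loud assumptions: variants are plain dicts, `variant_id` and `trusted` are hypothetical field names, and `mark_trusted` / `trusted_for_download` are made-up helpers, not existing Scout functions. The IDs of the ticked checkboxes are stored as a flag, and the download simply collects whatever is flagged.

```python
def mark_trusted(variants, selected_ids):
    """Return copies of the variants with a boolean 'trusted' flag.

    `selected_ids` stands in for the variant IDs ticked on the variants
    page; the input dicts are not modified.
    """
    selected = set(selected_ids)
    return [{**v, "trusted": v["variant_id"] in selected} for v in variants]


def trusted_for_download(variants):
    """Keep only the variants flagged as trusted, in their original order."""
    return [v for v in variants if v.get("trusted")]


if __name__ == "__main__":
    # Made-up variant IDs, purely for demonstration.
    case_variants = [{"variant_id": "var_a"}, {"variant_id": "var_b"}]
    flagged = mark_trusted(case_variants, ["var_b"])
    print(trusted_for_download(flagged))
```

Storing the flag on the variant rather than building the file at click time would let the same selection be downloaded again later from the same page.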

heronikdin commented 4 years ago

That would be a good option! After marking, will the VCF be available for download on the actual case page?

dnil commented 4 years ago

How about linking out to that db resource? Does it have a REST API, for example? Or is it perhaps not changing a crazy lot from day to day, so that a flat-file export, and annotation with it that could be displayed directly in Scout, would be feasible?

dnil commented 4 years ago

I have a strong feeling the next step will be copying and pasting info back from the other db into Scout..

hassanfa commented 4 years ago

Is MTBP safe and secure? I don't think we should export patient data to a server yet. Ideally we should have it available internally, if there is a need for it.

IMHO, exporting patient variant data to another server should be used with care.

northwestwitch commented 4 years ago

What's MTBP?

hassanfa commented 4 years ago

Molecular Tumor Board Portal

northwestwitch commented 4 years ago

Ah Ok! 👍

heronikdin commented 4 years ago

@dnil I don't know what API means... I think that @vwirta has summarised it very well, "the exported VCF would be the outcome of a clinical interpretation process, ie only selected variants would be exported", nothing else.

I have not worked with the portal myself; this is a "wish" from those I am working with to gain more clinical information from the data generated from Scout (i.e. only the true variants). Once the variants are uploaded and the report is created with therapeutic and clinical assessment data, all variants and information will be deleted immediately from the MTB portal.

I fully understand if a VCF can't be created from selected variants, but just wanted to explore the option. Would be glad for any feedback :)

dnil commented 4 years ago

Hero and I have had a quick meeting with the developers of MTBP - thank you for setting it up Hero! - and it turns out this small VCF export was only intended for a new-customer trial run of a few cases on the public side of the portal. They then have access to proprietary data that could be made available / annotated / reported by other mechanisms to us after signing of agreements. This would be the standard procedure for them anyway. I think we can close this issue for now, at least with regards to upload-to-external - right @heronikdin? With all intention to open a new interact-with-MTBP project later on for Balsamic, MIP and/or Scout after evaluation. Don't worry, we're not ditching a local collaborative resource that easily @vwirta. The data they collect would make a world of sense if included in a ranking / prioritisation model run after or within the pipelines, and displayed in Scout - or just displayed in Scout.

If there is still an interest in using vcf-export as an adjunct to reports or for a machine readable archiving of the variants triaged - which is not a bad idea - perhaps we can return to this in another issue?