bbglab / intogen-plus

A framework for automatic and comprehensive knowledge extraction based on mutational data from sequenced tumor samples from patients.
https://www.intogen.org/search

IntOGen plus | Update to MANE transcript #2

Closed FedericaBrando closed 6 months ago

FedericaBrando commented 1 year ago

TO DO (Federica): Run some cohorts as tests, updating the pipeline → Filter for MANE-Select → For the moment do NOT include MANE-Plus Clinical

  1. Update VEP to the latest version (v110?)
  2. Filter by MANE-Select. Start with some cohorts to detect possible errors
  3. Think about how to include MANE-Plus Clinical
FedericaBrando commented 1 year ago

Analysis of differences:

FedericaBrando commented 11 months ago

Image

FedericaBrando commented 11 months ago

Maybe it has something to do with:

Image

FedericaBrando commented 11 months ago

Image

Every time the VEP container is run, we get this warning:

Found a GitHub issue about it on their site:

This PR will fix the warning in the next release (v111).

FedericaBrando commented 11 months ago

Therefore the error (warning) is in the container. The problem is that, since we build the cache dir with a loop through chunks where we call the VEP container multiple times, we get a smartmatch warning for each call.
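Since the same warning is emitted once per chunked call, one way to keep the logs readable is to collapse identical lines and report a count. The following is a minimal sketch, assuming the stderr of each container call has already been captured as a string; `summarize_warnings` and the placeholder message are hypothetical and are not part of the pipeline or of VEP.

```python
from collections import Counter


def summarize_warnings(stderr_per_call):
    """Count identical non-empty stderr lines across many container calls."""
    counts = Counter()
    for stderr_text in stderr_per_call:
        for line in stderr_text.splitlines():
            line = line.strip()
            if line:
                counts[line] += 1
    return counts


# Hypothetical stderr captured from three chunked calls.
# The message is a placeholder, not the real warning text.
captured = ["WARNING: placeholder message\n"] * 3
summary = summarize_warnings(captured)
for message, n in summary.items():
    print(f"{message} (seen {n} times)")
```

This only tidies the output; the warning itself still comes from the container until the upstream fix is released.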

FedericaBrando commented 10 months ago

Run with MANE transcript:

FedericaBrando commented 10 months ago

Waiting for Monica to analyze.

FedericaBrando commented 10 months ago

Some genes are not found in the datasets. Monica and I found out that this is because they are filtered out in the ParseVep step. Specifically, a gene that is expected to be annotated as MANE is not.

I opened an issue on the VEP GitHub to report the bug.

To work around the issue, we decided to use a two-step conditional filter:

if no transcript is annotated as MANE, then use the canonical one.
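The fallback described above can be sketched as follows. This is an illustration only, not the actual ParseVep code: it assumes tab-separated VEP output with `Gene`, `Feature`, `CANONICAL` and `MANE_SELECT` columns, that empty values are written as `-`, and that the fallback is applied per gene. The function name `select_transcripts` and the example identifiers are hypothetical.

```python
import csv
import io
from collections import defaultdict


def select_transcripts(rows):
    """Keep MANE Select rows per gene; fall back to canonical rows otherwise."""
    by_gene = defaultdict(list)
    for row in rows:
        by_gene[row["Gene"]].append(row)

    kept = []
    for gene_rows in by_gene.values():
        # A row counts as MANE Select when the column holds a real value.
        mane = [r for r in gene_rows if r.get("MANE_SELECT", "-") not in ("", "-")]
        if mane:
            kept.extend(mane)
        else:
            # No MANE annotation for this gene: use the canonical transcript.
            kept.extend(r for r in gene_rows if r.get("CANONICAL") == "YES")
    return kept


# Made-up example rows; identifiers are placeholders.
EXAMPLE = (
    "Gene\tFeature\tCANONICAL\tMANE_SELECT\n"
    "ENSG_A\tENST_A1\tYES\tNM_000001.1\n"
    "ENSG_A\tENST_A2\t-\t-\n"
    "ENSG_B\tENST_B1\tYES\t-\n"
    "ENSG_B\tENST_B2\t-\t-\n"
)

rows = list(csv.DictReader(io.StringIO(EXAMPLE), delimiter="\t"))
selected = select_transcripts(rows)
selected_ids = [r["Feature"] for r in selected]
print(selected_ids)
```

In this example, gene `ENSG_A` keeps its MANE Select transcript, while `ENSG_B`, which has no MANE annotation, keeps its canonical transcript instead of being dropped.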

FedericaBrando commented 8 months ago

Ensembl was updated to v111.

FedericaBrando commented 6 months ago

Something to keep in mind is that with the update to Ensembl v111 some gene names will change.
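One way to see which names changed is to compare gene symbols by stable gene ID between the two releases. A minimal sketch, assuming each release is available as a mapping from gene ID to symbol; `renamed_genes` and all identifiers below are hypothetical placeholders, not real Ensembl data.

```python
def renamed_genes(old_symbols, new_symbols):
    """Return {gene_id: (old_symbol, new_symbol)} for IDs whose symbol changed."""
    return {
        gene_id: (old, new_symbols[gene_id])
        for gene_id, old in old_symbols.items()
        if gene_id in new_symbols and new_symbols[gene_id] != old
    }


# Placeholder mappings standing in for two annotation releases.
old_release = {"ENSG_X": "SYMBOL1", "ENSG_Y": "SYMBOL2"}
new_release = {"ENSG_X": "SYMBOL1", "ENSG_Y": "SYMBOL2_NEW", "ENSG_Z": "SYMBOL3"}

changes = renamed_genes(old_release, new_release)
print(changes)
```

Genes present in only one of the two releases are ignored here; they would need separate handling.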

FedericaBrando commented 6 months ago

TODO

Update boostdm DriverSaturation to use MANE


DriverSaturation uses canonical.regions.tsv