Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

melanoma/melanocyte specific cell line with --cell_type #625

Closed cagaser closed 4 years ago

cagaser commented 5 years ago

Hello ensembl-vep community,

Is it possible to identify regulatory variants in a melanoma- or melanocyte-specific cell line? I tried to list the available cell types in VEP 96, but melanocyte and melanoma are not among them. Is there any way I could identify regulatory variants in these cell lines?

at7 commented 5 years ago

Hi, VEP supports custom annotations. If you have any bed or bigwig files with expression information from your cell lines you could at least annotate the overlap with the respective regions defined in your files.
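As a rough sketch of the custom-annotation route described above: VEP reads custom BED files through tabix, which expects the file to be coordinate-sorted, bgzip-compressed and indexed. The helper below covers only the sorting step. The file names (`melanocyte_peaks.bed`, `input.vcf`) and the short name `melanocyte` are hypothetical, and the commented commands use the positional `--custom` syntax, which is assumed to match the VEP release in use.

```shell
# Sort BED records by chromosome, then numerically by start position,
# which is the order tabix needs. Reads the named files, or stdin if none given.
sort_bed() {
    LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n "$@"
}

# Hypothetical follow-up steps (file names are placeholders, not from the thread):
#   sort_bed melanocyte_peaks.bed | bgzip -c > melanocyte_peaks.bed.gz
#   tabix -p bed melanocyte_peaks.bed.gz
#   vep --cache -i input.vcf -o output.txt \
#       --custom melanocyte_peaks.bed.gz,melanocyte,bed,overlap,0
```

This reports which variants overlap regions in the file. It does not reproduce the activity labels that the regulatory build assigns per cell type.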

Best regards, Anja

cagaser commented 5 years ago

Thank you for the reply,

I am aware of the custom annotations (--custom); however, we are interested in using the --cell_type flag from VEP. I am not very knowledgeable about the Ensembl regulatory build, but I saw that a melanocyte cell type from Roadmap Epigenomics was included there: foreskin melanocyte [https://www.ensembl.org/Homo_sapiens/Experiment/Sources?db=funcgen;ex=all;fdb=funcgen;r=17:46617590-46621119#ExperimentalMetaData]

Is there any way we could use this information?

at7 commented 5 years ago

If you are using cache files, you can find a list of all available cell types in the info.txt file: homo_sapiens/98_GRCh38/info.txt. You can then provide a comma-separated list of cell types with the --cell_type flag. VEP will only return regulatory variants for those cell types. Don't forget to also use the --regulatory flag.
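A minimal sketch of checking the cache for a cell type before running VEP. It assumes that info.txt holds a line whose first field is `cell_types` and whose second field is the comma-separated list; that layout is inferred from the `awk '{print $2}'` command used later in this thread, so check it against your own cache. The function name, cache path and input/output file names are placeholders.

```shell
# Print the cell types recorded in a VEP cache info.txt, one per line.
# Assumption: the list is the second whitespace-separated field of the
# line whose first field is exactly "cell_types".
list_cell_types() {
    awk '$1 == "cell_types" { print $2 }' "$1" | tr ',' '\n'
}

# Hypothetical usage (paths are placeholders):
#   list_cell_types vep_cache/homo_sapiens/98_GRCh38/info.txt | grep -i melanocyte
#   vep --cache --dir_cache vep_cache -i input.vcf -o output.txt \
#       --regulatory --cell_type foreskin_melanocyte_1,foreskin_melanocyte_2
```

Splitting the list onto separate lines lets `grep` return only the matching names instead of the whole comma-separated field.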

cagaser commented 5 years ago

How come foreskin melanocyte is not included in the list even though it's in the regulatory build sources? https://www.ensembl.org/Homo_sapiens/Experiment/Sources?db=funcgen;ex=all;fdb=funcgen;r=17:46617590-46621119#ExperimentalMetaData Can I somehow add this information to the cache file?

at7 commented 5 years ago

The following cell types should be available: foreskin_fibroblast_2,foreskin_keratinocyte_1,foreskin_keratinocyte_2,foreskin_melanocyte_1,foreskin_melanocyte_2. The Ensembl funcgen team calculates regulatory features based on a selection of different input data, and for each cell type the regulatory features are assigned labels that describe their activity levels. The cache files contain the regulatory features; VEP can calculate variant overlaps with those features and also report the cell types in which each feature is active. It is not possible to add any additional cell types to the cache files.

cagaser commented 5 years ago

Thank you very much for the quick reply! I checked the list of available cell types; unfortunately, neither foreskin_melanocyte_1 nor foreskin_melanocyte_2 is in the list.

I am using the VEP 96 GRCh38 cache, and it doesn't seem to include these cell types (the command below returns nothing):

vep_cache/homo_sapiens/96_GRCh38⟫ cat info.txt | grep cell | awk '{print $2}'| grep foreskin

at7 commented 5 years ago

I'm sorry. I overlooked that you are working with release 96. The regulatory build is constantly updated. The best option would be if you could use the latest VEP code and annotation data.

cagaser commented 5 years ago

Thank you very much for your help! I will update my cache file to 98 then :)