svm-ai / svm-hackathon

COSMIC Card #22

Open SSU02 opened 1 year ago

Description of database

COSMIC, the Catalogue of Somatic Mutations in Cancer, is the world's largest source of expert, manually curated somatic mutation information relating to human cancers. It is available for both the GRCh37 and GRCh38 genome assemblies.

Access (API or download)

Free registration for academic use; commercial use requires a licence.

File format

TSV files. A sample of the COSMIC data is available for download here: https://cancer.sanger.ac.uk/cosmic/about (the first 100 lines of each download file are freely available).

Investigation with dataset

(I downloaded the GRCh37 data sample)

List of tsv files:

  1. Actionability_AllData_v9_GRCh37.tsv
  2. CancerMutationCensus_AllData_v98_GRCh37.tsv
  3. CellLinesProject_CompleteCNA_v98_GRCh37.tsv
  4. CellLinesProject_CompleteGeneExpression_v98_GRCh37.tsv
  5. CellLinesProject_GenomeScreensMutant_v98_GRCh37.tsv
  6. CellLinesProject_MutationTracking_v98_GRCh37.tsv
  7. CellLinesProject_NonCodingVariants_v98_GRCh37.tsv
  8. CellLinesProject_RawGeneExpression_v98_GRCh37.tsv
  9. CellLinesProject_Sample_v98_GRCh37.tsv
  10. Cosmic_Breakpoints_v98_GRCh37.tsv
  11. Cosmic_CancerGeneCensusHallmarksOfCancer_v98_GRCh37.tsv
  12. Cosmic_CancerGeneCensus_v98_GRCh37.tsv
  13. Cosmic_Classification_v98_GRCh37.tsv
  14. Cosmic_CompleteCNA_v98_GRCh37.tsv
  15. Cosmic_CompleteDifferentialMethylation_v98_GRCh37.tsv
  16. Cosmic_CompleteGeneExpression_v98_GRCh37.tsv
  17. Cosmic_CompleteTargetedScreensMutant_v98_GRCh37.tsv
  18. Cosmic_Fusion_v98_GRCh37.tsv
  19. Cosmic_Genes_v98_GRCh37.tsv
  20. Cosmic_GenomeScreensMutant_v98_GRCh37.tsv
  21. Cosmic_MutantCensus_v98_GRCh37.tsv
  22. Cosmic_MutationTracking_v98_GRCh37.tsv
  23. Cosmic_NonCodingVariants_v98_GRCh37.tsv
  24. Cosmic_ResistanceMutations_v98_GRCh37.tsv
  25. Cosmic_Sample_v98_GRCh37.tsv
  26. Cosmic_StructuralVariants_v98_GRCh37.tsv
  27. Cosmic_Transcripts_v98_GRCh37.tsv
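
To get a quick overview of the sample files listed above, something like the following could be used. It is only a sketch: it assumes the downloaded files sit together in one local folder (the name `cosmic_sample` is hypothetical) and that the first line of each file is a tab-separated header row.

```python
import csv
from pathlib import Path


def summarize_tsv_dir(directory):
    """Return {file name: (column names, number of data rows)} for each .tsv file."""
    summary = {}
    for path in sorted(Path(directory).glob("*.tsv")):
        with path.open(newline="", encoding="utf-8") as handle:
            # QUOTE_NONE: treat quote characters as ordinary text, split on tabs only.
            reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
            header = next(reader, [])        # first line is assumed to hold the column names
            n_rows = sum(1 for _ in reader)  # every remaining line counts as a data row
        summary[path.name] = (header, n_rows)
    return summary


if __name__ == "__main__":
    # "cosmic_sample" is a hypothetical folder holding the downloaded sample files.
    for name, (columns, n_rows) in summarize_tsv_dir("cosmic_sample").items():
        print(f"{name}: {n_rows} rows, {len(columns)} columns")
        print("  " + ", ".join(columns))
```

Printing the header of every file this way makes it easier to see which of the 27 files share columns and could be joined later.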

Of these files, file 2 (CancerMutationCensus_AllData_v98_GRCh37.tsv) contains information on pathogenicity and mutation type.
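
A rough way to locate the relevant columns in the Cancer Mutation Census file and see which values they take is sketched below. The exact column names are not stated here, so the search keywords (`"type"`, `"signif"`, `"patho"`) are guesses to adjust after looking at the real header; the local file path is also an assumption.

```python
import csv
from collections import Counter
from pathlib import Path


def matching_columns(path, keywords):
    """Return header names in a TSV file that contain any keyword (case-insensitive)."""
    with open(path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE), [])
    lowered = [k.lower() for k in keywords]
    return [col for col in header if any(k in col.lower() for k in lowered)]


def count_values(path, column):
    """Count how often each value occurs in one column of a TSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return Counter(row[column] for row in reader)


if __name__ == "__main__":
    # Assumed local path to the downloaded sample file.
    cmc_path = Path("CancerMutationCensus_AllData_v98_GRCh37.tsv")
    if cmc_path.exists():
        # The keywords are guesses, not confirmed COSMIC column names.
        for column in matching_columns(cmc_path, ["type", "signif", "patho"]):
            print(column, dict(count_values(cmc_path, column)))
```

Because the sample only holds the first 100 lines, the counts indicate which categories exist rather than how they are distributed in the full release.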