macarthur-lab / clinvar

This repo provides tools to convert ClinVar data into a tab-delimited flat file, and also provides that resulting tab-delimited flat file.

Need to Update pathogenic & conflict boolean regex. #45

Closed raymond301 closed 7 years ago

raymond301 commented 7 years ago

It looks like ClinVar has added a new string to clin_sig: "Conflicting interpretations of pathogenicity"

I handled it by changing these lines in join_data.R:

combined$pathogenic = as.integer(grepl('athogenic\\b',combined$clinical_significance))

# conflicted = 1 if at least one submission each of [likely] benign and [likely] pathogenic,
# or if ClinVar itself reports conflicting interpretations
combined$conflicted = as.integer(
  (grepl('athogenic', combined$clinical_significance) & grepl('enign', combined$clinical_significance)) |
  grepl('Conflicting interpretations', combined$clinical_significance, ignore.case=TRUE)
)

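That way the pathogenic flag no longer matches the word "pathogenicity" in the new string, and conflicted is set whenever ClinVar reports conflicting interpretations.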
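
For anyone who wants to check the two patterns before patching join_data.R, here is a minimal sketch. It assumes a toy data frame; the sample clinical_significance values are made up for illustration and are not taken from an actual ClinVar release.

```r
# Hypothetical sample values, for illustration only (not from a ClinVar release)
clinical_significance = c(
  'Pathogenic',
  'Likely pathogenic',
  'Benign',
  'Pathogenic;Benign',
  'Conflicting interpretations of pathogenicity'
)
combined = data.frame(clinical_significance, stringsAsFactors=FALSE)

# \\b requires a word boundary right after "athogenic",
# so "pathogenicity" does not count as pathogenic
combined$pathogenic = as.integer(grepl('athogenic\\b', combined$clinical_significance))

# conflicted = 1 if both benign and pathogenic appear,
# or if the string itself says "Conflicting interpretations"
combined$conflicted = as.integer(
  (grepl('athogenic', combined$clinical_significance) &
   grepl('enign', combined$clinical_significance)) |
  grepl('Conflicting interpretations', combined$clinical_significance, ignore.case=TRUE)
)

print(combined)
```

With these inputs, pathogenic should come out as 1, 1, 0, 1, 0 and conflicted as 0, 0, 0, 1, 1, so the new string is flagged as conflicted but not as pathogenic.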

XiaoleiZ commented 7 years ago

The new release has already updated the processing.