Closed stephanieshong closed 1 year ago
allow string type in local_schema.py
"note": {
"NOTE_ID": T.StringType(),
"PERSON_ID": T.StringType(),
"NOTE_DATE": T.DateType(),
"NOTE_DATETIME": T.TimestampType(),
"NOTE_TYPE_CONCEPT_ID": T.IntegerType(),
"NOTE_CLASS_CONCEPT_ID": T.IntegerType(),
"NOTE_TITLE": T.StringType(),
"NOTE_TEXT": T.StringType(),
"ENCODING_CONCEPT_ID": T.IntegerType(),
"LANGUAGE_CONCEPT_ID": T.IntegerType(),
"PROVIDER_ID": T.IntegerType(),
"VISIT_OCCURRENCE_ID": T.LongType(),
"VISIT_DETAIL_ID": T.LongType(),
"NOTE_SOURCE_VALUE": T.StringType(),
},
"note_nlp": {
"NOTE_NLP_ID": T.StringType(),
"NOTE_ID": T.StringType(),
"SECTION_CONCEPT_ID": T.IntegerType(),
"SNIPPET": T.StringType(),
"OFFSET": T.StringType(),
"LEXICAL_VARIANT": T.StringType(),
"NOTE_NLP_CONCEPT_ID": T.IntegerType(),
"NOTE_NLP_SOURCE_CONCEPT_ID": T.IntegerType(),
"NLP_SYSTEM": T.StringType(),
"NLP_DATE": T.DateType(),
"NLP_DATETIME": T.TimestampType(),
"TERM_EXISTS": T.BooleanType(),
"TERM_TEMPORAL": T.StringType(),
"TERM_MODIFIERS": T.StringType(),
},
Also update the step 4 ID generation code in step04_domain_mapping/note.sql and step04_domain_mapping/note_nlp.sql.
Some sites use a string data type for note IDs and note_nlp IDs. When we convert these strings to the long data type, the IDs become null and we lose all of that site's data. We need to support the string type at the local schema level and do the conversion to long during the domain mapping step. During the mapping step, use the string IDs to build the new N3C IDs for the NOTE and NOTE_NLP domains.
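To illustrate the failure mode and one possible direction for the fix, here is a minimal plain-Python sketch. It is not the code in note.sql or note_nlp.sql, which is not shown in this issue. The helper names `cast_to_long` and `hashed_long_id`, the `site_id`/`domain` key layout, and the use of SHA-256 are all assumptions made for illustration only.

```python
import hashlib
from typing import Optional


def cast_to_long(value: Optional[str]) -> Optional[int]:
    """Mimic a lenient string-to-long cast: non-numeric input becomes None."""
    if value is None:
        return None
    try:
        return int(value.strip())
    except ValueError:
        return None


def hashed_long_id(site_id: str, domain: str, source_id: Optional[str]) -> Optional[int]:
    """Hypothetical helper: derive a stable, non-negative 63-bit ID from a string ID.

    Assumption: hashing site + domain + source ID is acceptable for building
    new IDs. The actual N3C ID generation logic may differ.
    """
    if source_id is None:
        return None
    key = f"{site_id}|{domain}|{source_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Take the first 8 bytes and clear the sign bit so the value fits a signed long.
    return int.from_bytes(digest[:8], "big") & 0x7FFFFFFFFFFFFFFF


site_note_ids = ["N-0001", "a3f9-77", "42"]

# Casting early loses every ID that is not purely numeric.
print([cast_to_long(v) for v in site_note_ids])

# Keeping the string and deriving a long later preserves every row.
print([hashed_long_id("site_x", "note", v) for v in site_note_ids])
```

A truncated hash can collide in principle, so a real mapping step would likely need a collision check or a lookup table alongside whatever scheme it uses.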
Note: if a site is not submitting new NOTE datasets, be sure to use the cached datasets.