Closed clarisse-lau closed 5 months ago
This may be because the check is only being applied to primary IDs and not 'parent' IDs?
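If that is the cause, the difference would show up as in the rough sketch below, which runs the regex quoted in this issue over a primary ID column only and then over the primary and parent columns together. The column names, the row layout, and the `find_nonconforming` helper are assumptions made for illustration, not the actual hdash implementation.

```python
import re

# Regex quoted in this issue as the current ID validation rule.
HTAN_ID_RE = re.compile(
    r"^(HTA([1-9]|1[0-5]))_((EXT)?([0-9]\d*|0000))_([0-9]\d*|0000)$"
)


def find_nonconforming(rows, columns):
    """Return (row_index, column, value) for every non-empty value failing the regex.

    `rows` is a list of dicts keyed by column name (hypothetical manifest layout).
    """
    problems = []
    for i, row in enumerate(rows):
        for col in columns:
            value = row.get(col, "")
            if value and not HTAN_ID_RE.match(value):
                problems.append((i, col, value))
    return problems


# Hypothetical manifest row: conforming primary ID, non-conforming parent ID.
rows = [
    {"HTAN Biospecimen ID": "HTA11_120_1211", "HTAN Parent ID": "CE336E1-S1"},
]

# Checking only the primary ID column misses the problem.
primary_only = find_nonconforming(rows, ["HTAN Biospecimen ID"])

# Checking the parent column as well surfaces it.
with_parents = find_nonconforming(
    rows, ["HTAN Biospecimen ID", "HTAN Parent ID"]
)

print(primary_only)
print(with_parents)
```

Note that a parent ID can legitimately be a participant ID, which this single regex rejects, so a real check would likely need a separate pattern per ID type.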
Also one more note, the above regex works for HTAN Data File IDs and HTAN Biospecimen IDs (e.g. HTA11_120_1211), but not participant IDs (e.g. HTA11_120).
Updated HTAN ID regex rules (distinct for file, biospecimen, and participant IDs) can be found here: https://github.com/ncihtan/data-models/issues/268#issue-1808171944
This is now fixed and deployed. I relaxed the leading zero filter.
During the release process, our internal release scripts picked up on some HTAN IDs that did not match the HTAN ID Format SOP. However, these errors were not listed on hdash in the "Primary IDs follow the HTAN ID Spec" section.

For example, WashU had submitted parent biospecimen IDs such as CE336E1-S1, HT128B1-S1H4, and P5296-1N2. (These have now been resolved, but see example non-conforming IDs in the previous version (v6) of the snATAC-Seq_level_1_atac_tumor manifest: https://www.synapse.org/#!Synapse:syn52257214.6)

Currently the regex rule used for ID validation is:
^(HTA([1-9]|1[0-5]))_((EXT)?([0-9]\d*|0000))_([0-9]\d*|0000)$
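To make the rule's behavior concrete, here is a minimal sketch that applies the regex exactly as quoted to the IDs mentioned in this thread. The helper name `matches_current_rule` is made up for illustration; only the pattern itself comes from the issue.

```python
import re

# Pattern copied verbatim from this issue.
HTAN_ID_PATTERN = r"^(HTA([1-9]|1[0-5]))_((EXT)?([0-9]\d*|0000))_([0-9]\d*|0000)$"


def matches_current_rule(htan_id: str) -> bool:
    """Return True if the ID satisfies the regex quoted in this issue."""
    return re.match(HTAN_ID_PATTERN, htan_id) is not None


# IDs taken from the discussion in this thread.
examples = [
    "HTA11_120_1211",  # biospecimen-style ID
    "HTA11_120",       # participant-style ID
    "CE336E1-S1",      # non-conforming parent ID
    "HT128B1-S1H4",    # non-conforming parent ID
    "P5296-1N2",       # non-conforming parent ID
]

results = {htan_id: matches_current_rule(htan_id) for htan_id in examples}
for htan_id, ok in results.items():
    print(f"{htan_id}: {'match' if ok else 'no match'}")
```

Only HTA11_120_1211 matches. The three WashU-style parent IDs are rejected, and so is the participant-style ID HTA11_120, because the pattern requires three underscore-separated parts.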