nasa / GeneLab_Data_Processing

[BulkRNASeq] V&V program mistypes samplenames as ints when possible #33

Open J-81 opened 1 year ago

J-81 commented 1 year ago

Description

When sample names can be interpreted as numbers rather than strings (e.g. 12), certain V&V checks incorrectly do so, which results in a failure to match values such as '1' and 1.
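The mismatch can be reproduced with a minimal sketch. It assumes the runsheet is read with pandas using default type inference; the `Sample Name` column header and the CSV contents are hypothetical and only illustrate the failure mode.

```python
import io

import pandas as pd

# Hypothetical runsheet fragment; the "Sample Name" header is an assumption.
RUNSHEET_CSV = (
    "Sample Name,organism\n"
    "1,Mus musculus\n"
    "2,Mus musculus\n"
    "12,Mus musculus\n"
)

# With default type inference, all-numeric sample names are parsed as integers.
inferred = pd.read_csv(io.StringIO(RUNSHEET_CSV))
inferred_names = set(inferred["Sample Name"])

# Sample names obtained elsewhere (for example from file names) are strings.
names_from_files = {"1", "2", "12"}

# No string name matches its integer counterpart, so every name is reported missing.
mismatch = names_from_files - inferred_names
print(sorted(mismatch))
```

Because `'1' == 1` is false in Python, a check that compares the two sets treats every sample as unmatched.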

Approaches

The samplename column should also be read in as string datatype.

Implementation Suggested

All runsheet loading should use a standard interface that interprets samplename as string datatype.
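One possible shape for such an interface is sketched below. It assumes pandas is used for loading; the function name `load_runsheet` and the `Sample Name` column header are hypothetical and not taken from the existing V&V code.

```python
import io

import pandas as pd

# Assumed column header for sample names; adjust to the actual runsheet schema.
SAMPLE_NAME_COLUMN = "Sample Name"


def load_runsheet(source):
    """Load a runsheet CSV, forcing the sample name column to string dtype.

    `source` may be a file path or a file-like object. Other columns keep
    pandas' default type inference.
    """
    return pd.read_csv(source, dtype={SAMPLE_NAME_COLUMN: str})


# Usage with an in-memory runsheet whose sample names look numeric.
runsheet = load_runsheet(
    io.StringIO("Sample Name,organism\n1,Mus musculus\n12,Mus musculus\n")
)
sample_names = list(runsheet[SAMPLE_NAME_COLUMN])
print(sample_names)
```

Routing every check through a single loader like this keeps the string conversion in one place, so individual checks do not each need to remember to cast sample names.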

Validation Plan

GLDS-201 triggers the error and can serve as a good test case.

Impact

No impact on prior data, since the error causes a false workflow halt.