lit-regensburg / samshee

A schema-agnostic parser and writer for illumina® sample sheets v2 and similar documents.
MIT License

schema to enforce unique Sample_ID #3

Open britnyblu opened 2 weeks ago

britnyblu commented 2 weeks ago

Hi, great package first of all! I'd like to enforce unique Sample_IDs. I see that you wrote a comment suggesting this. Is there a way to enforce this via a custom JSON schema, or does it need to be added directly to the code?

j4cko commented 2 weeks ago

Hi, as far as I understand, this is currently not possible with json-schema (its `uniqueItems` keyword compares whole rows, not a single field), but you can write a custom validation function to make it work:

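import samshee.validation
from samshee.sectionedsheet import SectionedSheet  # import path assumed; adjust to your samshee version

def unique_sampleids(doc: SectionedSheet) -> None:
    # Collect every Sample_ID listed in the BCLConvert_Data section.
    sampleids = [e['Sample_ID'] for e in doc['BCLConvert_Data']]
    # A set drops repeats, so a length mismatch means at least one duplicate.
    if len(sampleids) != len(set(sampleids)):
        raise Exception('"Sample_ID"s are not unique.')

# doc is an already-parsed sheet
samshee.validation.validate(doc, unique_sampleids)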

(see the examples in the Validation section of the README)
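
If it helps to know which IDs collide, the same check could be extended roughly as in the sketch below. It is written against a plain dict standing in for the parsed sheet so that it runs without samshee installed. The names `find_duplicate_sampleids` and `unique_sampleids_verbose` and the example data are made up for illustration and are not part of the package.

```python
from collections import Counter


def find_duplicate_sampleids(doc, section="BCLConvert_Data"):
    """Return the sorted Sample_ID values that occur more than once in `section`."""
    counts = Counter(row["Sample_ID"] for row in doc[section])
    return sorted(sid for sid, n in counts.items() if n > 1)


def unique_sampleids_verbose(doc) -> None:
    """Validator-style function: raise if any Sample_ID is repeated, naming the repeats."""
    dupes = find_duplicate_sampleids(doc)
    if dupes:
        listed = ", ".join(dupes)
        raise ValueError(f'"Sample_ID"s are not unique: {listed}')


# Hypothetical stand-in for a parsed sheet: a mapping from section name to rows.
example = {
    "BCLConvert_Data": [
        {"Sample_ID": "S1", "Index": "AAAA"},
        {"Sample_ID": "S2", "Index": "CCCC"},
        {"Sample_ID": "S1", "Index": "GGGG"},
    ]
}

print(find_duplicate_sampleids(example))
```

Assuming the sheet object supports the same `doc['BCLConvert_Data']` access as in the snippet above, `unique_sampleids_verbose` could be passed to the validation call in place of `unique_sampleids`.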

Hope this helps...?