AlexsLemonade / refinebio

Refine.bio harmonizes petabytes of publicly available biological data into ready-to-use datasets for cancer researchers and AI/ML scientists.
https://www.refine.bio/
Other
128 stars 19 forks

Update `validate_dataset` to allow for experiments with duplicate samples #3131

Closed: arkid15r closed this 1 year ago

arkid15r commented 1 year ago

Issue Number

Resolves #3055

Purpose/Implementation Notes

Rewrite some checks in a more readable form. Track entity IDs (accessions, experiments) in sets instead of lists. Remove `str + str` concatenation.
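
A minimal sketch of the set-based tracking idea, not the actual refinebio code: `validate_dataset_sketch` and the shape of `data` (experiment accession mapped to a list of sample accessions) are assumptions made for illustration, and the f-strings are just one possible replacement for `str + str`.

```python
def validate_dataset_sketch(data):
    """Hypothetical stand-in for illustration; not refinebio's validate_dataset.

    `data` is assumed to map an experiment accession to a list of
    sample accessions.
    """
    if not isinstance(data, dict):
        raise ValueError(f"`data` must be a dict, got {type(data).__name__}.")

    experiments = set()
    samples = set()
    for experiment, accessions in data.items():
        if not isinstance(accessions, list) or not accessions:
            # f-string instead of str + str concatenation
            raise ValueError(f"Experiment {experiment} has no sample accessions.")
        experiments.add(experiment)
        # A sample listed twice, or shared by two experiments, is stored once,
        # so duplicates do not cause a failure here.
        samples.update(accessions)

    return experiments, samples
```

With sets, membership checks are constant time on average and repeated accessions need no special handling, which is why duplicates can be tolerated without extra bookkeeping in this sketch.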

Types of changes

Functional tests

Checklist