hubmapconsortium / metadata-consistency

How can an end-user tell if a dataset is derived or lab processed? #5

Open icaoberg opened 1 year ago

icaoberg commented 1 year ago

What is the query, if any? Which endpoint supports this inquiry?

Bottom line: is there any field, or combination of fields, that we can use to tell derived and lab-processed datasets apart?

MariahKenney commented 1 year ago

Previously, we discussed using the function below. @icaoberg why did it not work for you?

import hubmapbags

def is_primary(hubmap_id, instance='prod', token=None):
    # Treat a dataset as primary when the first ancestor returned is a Sample.
    metadata = hubmapbags.apis.get_ancestors_info(hubmap_id, instance=instance, token=token)
    # Return False rather than raising when no ancestors come back.
    return bool(metadata) and metadata[0].get('entity_type') == 'Sample'

# hmid and token must be defined before this call.
print(is_primary(hmid, instance='prod', token=token))

icaoberg commented 1 year ago

@MariahKenney With that I can label a dataset as primary or derived. But when a dataset is labeled as derived, I should be able to add a second label: either a derived dataset (created by the CMU team) or a lab-processed dataset (submitted by data providers).
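
To make the request concrete, here is a minimal sketch of the two-level labeling. It reuses the `entity_type == 'Sample'` check from `is_primary` for the first level. The second level is the open question in this issue, so the rule is passed in as a caller-supplied predicate. The names `label_dataset` and `is_lab_processed` are hypothetical and not part of `hubmapbags`, and the sketch does not assume any particular metadata field.

```python
from typing import Callable, Optional


def label_dataset(
    ancestors: list,
    is_lab_processed: Optional[Callable[[list], bool]] = None,
) -> str:
    """Label a dataset from its ancestor records (a list of dicts).

    `is_lab_processed` is a hypothetical, caller-supplied rule; no field
    that separates the two derived kinds has been identified in this thread.
    """
    # First level: same check as is_primary (first ancestor is a Sample).
    if ancestors and ancestors[0].get('entity_type') == 'Sample':
        return 'primary'
    # Second level: without a rule, the dataset can only be called derived.
    if is_lab_processed is None:
        return 'derived (unclassified)'
    return 'lab processed' if is_lab_processed(ancestors) else 'derived'


# Example with hand-written ancestor records instead of an API call.
print(label_dataset([{'entity_type': 'Sample'}]))
print(label_dataset([{'entity_type': 'Dataset'}]))
```

Once a field or combination of fields is identified, it would replace the predicate. Note that in this sketch an empty ancestor list falls through to the derived branch.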