lubianat / pyorcidator

MIT License

get_organization_list not returning full list of organizations #24

Closed: jvfe closed this issue 2 years ago

jvfe commented 2 years ago

In the branch tests/get_org_list_bug I built a simple test to check if helper.get_organization_list could return a list of all organizations in the sample data. However, it currently only returns 2 of the 4 organizations (Output of pytest -v):

E       AssertionError: assert ['Harvard Med...l', 'Q152171'] == ['Harvard Med...unhofer SCAI']
E         At index 1 diff: 'Q152171' != 'Enveda Biosciences'
E         Right contains 2 more items, first extra item: 'Q152171'
E         Full diff:
E         - ['Harvard Medical School', 'Enveda Biosciences', 'Q152171', 'Fraunhofer SCAI']
E         + ['Harvard Medical School', 'Q152171']

As you can see, the function only returns the first and the third employment entries. I believe this piece of code is the reason:

if a["disambiguated-organization"] is None:
    continue

If the value of disambiguated-organization is None (which is the case for the second and fourth entries I showed above), the loop jumps to the next entry in the list, so that entry never makes it into the returned list. So, what's the reason behind this line? Is there something I'm missing here? Thanks
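To make the suggestion concrete, here is a minimal sketch of one possible fix, not the actual pyorcidator code. It assumes each entry is a dict with a "name" key and an optional "disambiguated-organization" dict. The function name collect_organizations, the sample data, and the identifier field are hypothetical and used only for illustration. Instead of skipping entries without a disambiguated organization, it falls back to the plain name.

```python
def collect_organizations(organizations):
    """Return one label per organization without dropping any entry.

    Hypothetical sketch: the dict layout is an assumption, not the
    structure used by helper.get_organization_list.
    """
    labels = []
    for org in organizations:
        # .get() covers both a missing key and an explicit None value.
        disambiguated = org.get("disambiguated-organization")
        if disambiguated is None:
            # Fall back to the plain name rather than skipping the entry.
            labels.append(org["name"])
        else:
            labels.append(disambiguated["disambiguated-organization-identifier"])
    return labels


# Made-up sample data covering the three cases: None, present, key missing.
sample = [
    {"name": "Org A", "disambiguated-organization": None},
    {
        "name": "Org B",
        "disambiguated-organization": {
            "disambiguated-organization-identifier": "ID-B"
        },
    },
    {"name": "Org C"},
]

result = collect_organizations(sample)
print(result)
```

With this shape, the returned list always has as many items as the input, which is what the failing test expects.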

lubianat commented 2 years ago

Hm, I am not sure what is happening there; I'll have to check. I don't think there was a specific reason, maybe just an untested corner case. Feel free to remove or change it.