Currently, when both the PMID and the DOI are available in a dataset, the enrich.py script first tries to find the OpenAlexID based on the PMID.
For every record that has a PMID, it then overwrites the record's current PMID and DOI with the values returned by OpenAlex, even if no OpenAlexID (and thus no DOI) was found.
This leaves those records with an empty DOI, so they are never searched by DOI afterwards.
Given that all following steps only use the OpenAlexID, I suggest we simply do not overwrite the PMID and DOI with the values we retrieve from OpenAlex. This solves the problem and has the added benefit that we keep the ID that was present in the original data and was actually searched on.
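To illustrate the proposed behaviour, here is a minimal sketch. It is not the actual enrich.py code: the function name `merge_openalex_result`, the record keys `pmid`, `doi` and `openalex_id`, and the shape of the lookup result are all assumptions made for the example.

```python
from typing import Optional


def merge_openalex_result(record: dict, result: Optional[dict]) -> dict:
    """Attach the OpenAlex ID to a record without touching its original PMID/DOI.

    `record` and `result` are hypothetical structures used for illustration only.
    """
    merged = dict(record)  # copy so the input record stays unchanged
    # Only the OpenAlex ID is taken from the lookup; PMID and DOI are left as-is.
    merged["openalex_id"] = result.get("id") if result else None
    return merged


record = {"pmid": "12345", "doi": "10.1000/example"}

# Simulate a PMID lookup that found nothing: the DOI survives,
# so a later DOI-based lookup is still possible.
unmatched = merge_openalex_result(record, None)

# Simulate a successful lookup: only the OpenAlex ID is added.
matched = merge_openalex_result(record, {"id": "W0000000000", "doi": None})
```

With this approach a failed PMID lookup leaves the original DOI in place, so the record can still go through the DOI-based search.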