The API publishers are harvested by date. Unlike the SFTP-based publishers, the responses come in many different shapes of article batches.
Dilemma: how should we force-harvest records? One by one, using DOIs? The APS and Hindawi APIs can fetch a single article by DOI, but not a group of DOIs.
How should we reprocess records that have already been downloaded?
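One possible shape for the force-harvest path is a simple per-DOI loop, since the APS and Hindawi APIs only accept one DOI per request. This is only a sketch: the `fetch_by_doi` callable stands in for a real per-publisher API client, which is not shown here.

```python
from typing import Callable, Dict, Iterable


def harvest_by_dois(dois: Iterable[str],
                    fetch_by_doi: Callable[[str], str]) -> Dict[str, str]:
    """Fetch articles one DOI at a time.

    The APS and Hindawi APIs accept a single DOI per request, not a
    batch, so forced harvesting has to iterate over the DOI list.
    """
    results = {}
    for doi in dois:
        # A real fetcher would call the publisher's API and could
        # retry or skip DOIs that fail; this sketch just collects.
        results[doi] = fetch_by_doi(doi)
    return results


# Usage with a stub fetcher (a real one would issue HTTP requests):
fetched = harvest_by_dois(
    ["10.1103/example.1", "10.1155/example.2"],
    lambda doi: f"<article doi='{doi}'/>",
)
print(len(fetched))  # 2
```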
Currently, we save the original API response to a file (XML or JSON), later split it into separate records, and send those to the process-files DAG. The separate records ARE NOT saved in individual files.
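The save-then-split flow can be sketched roughly as below. The response layout (a top-level `"data"` list) and the dispatch function are assumptions for illustration, not the actual response schema or DAG-trigger code.

```python
import json
from typing import Dict, List


def split_response(raw_response: str) -> List[Dict]:
    """Split a saved JSON API response into separate article records.

    Assumes the publisher returns its articles under a top-level
    "data" key; real response layouts vary per publisher.
    """
    payload = json.loads(raw_response)
    return payload.get("data", [])


def send_to_processing(records: List[Dict]) -> int:
    """Placeholder for handing each record to the process-files DAG."""
    for record in records:
        # In Airflow this would trigger a DAG run per record (e.g. via
        # TriggerDagRunOperator); here we only count what we would send.
        pass
    return len(records)


# Usage: the raw response stays on disk as one file; records are
# split in memory and dispatched without being saved individually.
raw = '{"data": [{"doi": "10.1103/a"}, {"doi": "10.1103/b"}]}'
records = split_response(raw)
print(send_to_processing(records))  # 2
```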