Hindawi improvement: stage 1

Desired functionality: re-harvest and reprocess already downloaded articles after errors, code changes...

[x] Add the functionality to harvest articles by DOI. Example of retrieving a single article: https://www.hindawi.com/oai-pmh/oai.aspx?verb=getrecord&identifier=oai:hindawi.com:10.1155/2011/391971&metadataprefix=oai_dc
[x] ~~Do not reprocess articles that are already in the repo~~ (moved to a different task: requires discussion)
[x] ~~Add functionality to reprocess already downloaded files by triggering the 'trigger_files_processing' task with the S3 keys corresponding to the articles~~ (moved to a different task: requires discussion)
[x] The Hindawi parser extracts the wrong affiliation country for USA, but not for other countries
[x] ~publication_info: new workflows don't have publication_info.page_start, but we have it in hepcrawl (though not for other publishers in the workflow). Cannot find the exact place in the code; it looks like it is taken from <article-id pub-id-type="publisher-id">. And where is the pubnote?~ (moved to a different issue: https://github.com/cern-sis/issues-scoap3/issues/124)
[x] copyright.year: string or int? For some publishers (such as APS) it is an int; for Hindawi it is a string.
[x] no collections
[x] ~Do we need raw_name in authors?~ (moved to a different issue)
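
For the harvest-by-DOI item, a minimal sketch of how the request URLs could be built from a list of DOIs. It only mirrors the example URL in the checklist; the function names `build_getrecord_url` and `build_getrecord_urls` are hypothetical and not taken from the harvesting code.

```python
from urllib.parse import urlencode

# Base endpoint taken from the example URL in the checklist above.
HINDAWI_OAI_BASE = "https://www.hindawi.com/oai-pmh/oai.aspx"


def build_getrecord_url(doi, metadata_prefix="oai_dc"):
    """Build the single-record URL for one DOI (hypothetical helper)."""
    params = {
        "verb": "getrecord",
        "identifier": f"oai:hindawi.com:{doi}",
        "metadataprefix": metadata_prefix,
    }
    # Keep ':' and '/' unescaped so the result matches the example URL.
    return f"{HINDAWI_OAI_BASE}?{urlencode(params, safe=':/')}"


def build_getrecord_urls(dois):
    """Build one URL per DOI, skipping empty entries (hypothetical helper)."""
    return [build_getrecord_url(doi.strip()) for doi in dois if doi.strip()]
```

Calling `build_getrecord_url("10.1155/2011/391971")` reproduces the example URL from the first checklist item.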

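For the copyright.year item, a minimal sketch of one way to make the type consistent across publishers. It assumes int is the chosen target type, which this issue does not state; `normalize_copyright_year` is a hypothetical helper, not part of the existing parsers.

```python
def normalize_copyright_year(value):
    """Return the copyright year as an int, or None if it is missing.

    Assumption: int is the agreed type (as seen for APS); the issue itself
    leaves the string-vs-int question open.
    """
    if value is None:
        return None
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("copyright year must not be a bool")
    if isinstance(value, int):
        return value
    text = str(value).strip()
    if not text.isdigit():
        raise ValueError(f"not a valid copyright year: {value!r}")
    return int(text)
```

With this helper, a Hindawi value such as `"2011"` and an APS value such as `2011` both end up as the same int.
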