cern-sis / issues-scoap3


3 Hindawi articles not in the repo. #276

Open agentilb opened 5 months ago

agentilb commented 5 months ago

Is it possible to check whether these articles are somewhere in the repo: 10.1155/2023/8127604, 10.1155/2023/6922729, 10.1155/2023/6614276?

I cannot find them, but it seems they are SCOAP3 articles.

Thanks in advance!

ErnestaP commented 5 months ago

I managed to harvest 10.1155/2023/8127604. However, I was not able to harvest the other 2. They are not in the error state either. I will continue checking.

agentilb commented 5 months ago

Any news about the 2 other articles?

ErnestaP commented 5 months ago

I managed to find them:

https://www.hindawi.com/oai-pmh/oai.aspx?verb=listrecords&metadataprefix=oai_dc&from=2023-08-07

https://www.hindawi.com/oai-pmh/oai.aspx?verb=listrecords&metadataprefix=oai_dc&from=2023-06-15

But I am struggling to harvest them.

For developers: I am not sure how to trigger the harvester. I need to pass metadataprefix to the trigger. Snippet:

  from flask import current_app
  from flask_celeryext.app import current_celery_app

  def run_beat_task(task_name):
      # Needs an active Flask application context (e.g. inside `flask shell`).
      task = current_app.config["CELERYBEAT_SCHEDULE"][task_name]
      # Copy the kwargs so the shared schedule config is not modified.
      kwargs = dict(task["kwargs"])
      kwargs["from_date"] = "2023-06-15"
      kwargs["until_date"] = "2023-06-15"
      kwargs["metadata_prefix"] = "oai_dc"
      current_celery_app.send_task(task["task"], kwargs=kwargs)

  run_beat_task('oai-harvest-hindawi')

ErnestaP commented 5 months ago

We found why the articles are not in the repo: they are not in the set we are harvesting. We are harvesting articles with the following parameters:

set=HINDAWI.AHEP, metadataprefix=marc21, from=(needed date), until=(needed date)

When we harvest articles, the URL looks like this (for article 10.1155/2023/6922729): https://www.hindawi.com/oai-pmh/oai.aspx?verb=listrecords&metadataprefix=marc21&set=HINDAWI.AHEP&from=2023-06-15&until=2023-06-15

However, the API returns 0 results, which is why we cannot harvest them: these articles are not in the set we have to harvest.
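For anyone who wants to repeat this check, here is a minimal sketch of how the request URL above can be assembled and how the records in a response can be counted. It is not the harvester's own code: `build_listrecords_url` and `count_records` are hypothetical helper names, and the parameter spelling (`listrecords`, `metadataprefix`) simply mirrors the URLs quoted in this thread. The only external fact assumed is the standard OAI-PMH 2.0 XML namespace.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Standard OAI-PMH 2.0 namespace; the endpoint is the one quoted in this thread.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
BASE_URL = "https://www.hindawi.com/oai-pmh/oai.aspx"


def build_listrecords_url(set_spec, metadata_prefix, from_date, until_date,
                          base_url=BASE_URL):
    """Build a ListRecords URL (hypothetical helper, spelling as in the thread)."""
    params = {
        "verb": "listrecords",
        "metadataprefix": metadata_prefix,
        "set": set_spec,
        "from": from_date,
        "until": until_date,
    }
    return f"{base_url}?{urlencode(params)}"


def count_records(response_xml):
    """Count <record> elements in an OAI-PMH response body (hypothetical helper).

    A response that carries only an <error> element, such as the
    noRecordsMatch error, contains no <record> elements and yields 0.
    """
    root = ET.fromstring(response_xml)
    return len(root.findall(".//oai:record", OAI_NS))


url = build_listrecords_url("HINDAWI.AHEP", "marc21", "2023-06-15", "2023-06-15")
```

Fetching `url` with any HTTP client and passing the body to `count_records` would show whether the set contains anything for that date. Swapping `set_spec` or `metadata_prefix` makes it easy to compare against the oai_dc listing where the articles were found.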

agentilb commented 5 months ago

Hi @ErnestaP, it seems Hindawi has done something with the 2 articles. Could you try to re-harvest them? Thanks!

ErnestaP commented 5 months ago

Hi @agentilb, sorry, maybe I missed an email: what exactly did they do? I tried to re-harvest the articles using the same dates, unfortunately without success. As far as I can see from the records, the timestamps have remained the same.

ErnestaP commented 5 months ago

Tried to harvest them again, and it worked. Both articles are in the repo: https://repo.scoap3.org/records/82917 https://repo.scoap3.org/records/82916

ErnestaP commented 5 months ago

@agentilb can we close the issue? :)