EBISPOT / goci

GWAS Catalog Ontology and Curation Infrastructure
Apache License 2.0

PMID not getting imported from EPMC in depo-curation app #1398

Closed Santhi1901 closed 1 month ago

Santhi1901 commented 2 months ago

PMID: 39046104 is not getting imported to the depo-curation app.

I also noted that some of the PMIDs from previous weeks were not imported when added in bulk.

For example, at https://www.ebi.ac.uk/gwas/depo-curation/publications/66a372872571d9000176c959, the tracking info shows that PMIDs 39002897, 39023044, 39011893, 39002029, 39022727... were created on 2024-07-26T09:55:23.286Z, but 39023044, 39011893, 39022727, 39024449, 39013416, and 39019884 were not imported.

I tried importing those missing PMIDs again today, and they were all added except 39046104, which is still not getting imported.

Some of the PMIDs are also missing the event tracking info, for example https://www.ebi.ac.uk/gwas/depo-curation/publications/66b9e1a72571d9000176ce68.

Santhi1901 commented 2 months ago

I added these PMIDs to the depo-curation DB on 12 July, and none of them now exist in depo-curation: 38970458 38966903 38969199 38967582 38951977 38960375 38957036 38958042 38945087 38951484 38973605 38980841 38970920 38997780 39000095 38982111 38978726 38982179 38980270 38971891 39023044 39011893 39022727 39024449 39013416 39019884

sajo-ebi commented 2 months ago

@Santhi1901 we identified the issue: Depo Sync had logic to delete publications without studies on the Mongo side. Since we moved the EuropePMC import to Depo Curation, we no longer create a default study in the DB, so Depo Sync deletes the new publications. We have deployed the fix, and the missing PMIDs will need to be manually imported again. Please report any errors to me so that I can check the logs immediately.
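
The cleanup rule described above can be illustrated with a minimal sketch. This is not the GOCI code, and the thread does not say how the deployed fix works. The `Publication` class, the `source` field, the `EUROPEPMC_IMPORT` label, and the non-39046104 IDs are all hypothetical placeholders. The sketch only shows why a "delete publications without studies" rule removes freshly imported records, and one possible guard against it.

```python
from dataclasses import dataclass, field


@dataclass
class Publication:
    pmid: str
    study_ids: list = field(default_factory=list)
    # Hypothetical field recording where the record was created.
    source: str = "DEPOSITION"


def publications_to_delete(publications, protect_sources=("EUROPEPMC_IMPORT",)):
    """Return the publications a sync cleanup would delete.

    A publication is selected when it has no studies, unless its
    (assumed) source is listed in protect_sources.
    """
    return [
        p for p in publications
        if not p.study_ids and p.source not in protect_sources
    ]


pubs = [
    # Newly imported record: no default study is created any more.
    Publication("39046104", source="EUROPEPMC_IMPORT"),
    # Placeholder record that already has a study attached.
    Publication("00000001", study_ids=["STUDY-1"]),
    # Placeholder legacy record with no study.
    Publication("00000002"),
]

# Old behaviour: every study-less publication is deleted, including new imports.
old_rule = publications_to_delete(pubs, protect_sources=())
# Guarded behaviour: study-less records from the import path are kept.
new_rule = publications_to_delete(pubs)
```

With the guard in place, only the legacy study-less record is selected; without it, the new import is deleted as well, which matches the symptom reported earlier in this thread.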

Santhi1901 commented 2 months ago

The PMIDs have been added, but the status and curator are empty. On the publication page, e.g. https://www.ebi.ac.uk/gwas/depo-curation/publications/66bd1b5744e4cf0001cb18a4, everything is empty.

Santhi1901 commented 2 months ago

This is the list of PMIDs which are blank in depo-curation app: 38970458 38966903 38969199 38967582 38951977 38960375 38957036 38958042 38945087 38951484 38973605 38980841 38970920 38997780 39000095 38982111 38978726 38982179 38980270 38971891 39022727 39054468 39039282 39061961 39038097 39046104 39057616 39062192 39059088 39057031 39045004 39040048 39056224 39060932 39062709 39050255 39090729 39093873 39072000 39079175 39094715 39086896 39077866 39085219 39091897 39079071

sajo-ebi commented 2 months ago

@Santhi1901 the PMIDs have been deleted and can be imported from EuropePMC.

Santhi1901 commented 2 months ago

I added them in 3 batches. In one of the batches the import result was not displayed, and 2 PMIDs in that batch had problems:

39046104: did not get added
39045004: blank details

The rest of the PMIDs in that batch are on the publication page and look fine. The other batches had no problems, and all their PMIDs are present on the publication page with the required details.

Santhi1901 commented 2 months ago

39045004 has been added successfully. 39046104 got an unexpected response from EuropePMC. The error handling does not cope with this unexpected response, so the import result is not displayed. Sajo mentioned changing the code to handle the error.
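
One way to keep a single bad response from hiding the whole import result is to record an outcome per PMID. The sketch below is only an illustration under stated assumptions: `import_batch`, `ImportResult`, `fake_fetch`, and the check for a `title` key are hypothetical and do not reflect the depo-curation code or the real EuropePMC response format.

```python
from dataclasses import dataclass


@dataclass
class ImportResult:
    pmid: str
    status: str        # "created" or "failed"
    message: str = ""


def import_batch(pmids, fetch_metadata, save):
    """Import each PMID independently and always return one result per PMID."""
    results = []
    for pmid in pmids:
        try:
            metadata = fetch_metadata(pmid)
            # Assumed validity check: a usable payload is a dict with a title.
            if not isinstance(metadata, dict) or not metadata.get("title"):
                raise ValueError("unexpected response from EuropePMC")
            save(pmid, metadata)
            results.append(ImportResult(pmid, "created"))
        except Exception as exc:
            # A failure is recorded for this PMID; the batch carries on.
            results.append(ImportResult(pmid, "failed", str(exc)))
    return results


def fake_fetch(pmid):
    # Stand-in for the EuropePMC call; returns a malformed payload for one PMID.
    return None if pmid == "39046104" else {"title": f"Placeholder title for {pmid}"}


saved = {}
results = import_batch(["39045004", "39046104"], fake_fetch, saved.__setitem__)
for r in results:
    print(r.pmid, r.status, r.message)
```

Because every PMID gets its own entry, the import result can still be shown for the batch, with the failing PMID flagged rather than silently dropped.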

ljwh2 commented 2 months ago

@ala-ebi to investigate unexpected response from EuropePMC

ala-ebi commented 2 months ago

@Santhi1901 @ljwh2 I have tried importing the PMID 39046104 to debug the above error, but I got a message saying PMID already exists, which was true apparently. However, I don't see it in the tracking.

A separate bug is that the event tracking creates an event for each attempt to import a PMID, even when the PMID already exists. I saw an entry in the event tracking saying "PMID created" even though the import returned a "PMID already exists" message. I tried this in Sandbox with a different PMID and the behaviour is the same: it creates a new event each time I click import on the same PMID.
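
The expected behaviour can be sketched as follows: the existence check runs first, and the event is written only on the branch that actually creates the record. The function name and the in-memory `publications` and `events` stores are hypothetical stand-ins, not the application's real data model.

```python
def import_pmid(pmid, publications, events):
    """Create the publication once; record an event only on real creation."""
    if pmid in publications:
        # No tracking event here: nothing was created.
        return "PMID already exists"
    publications[pmid] = {"pmid": pmid}
    events.append(("PMID created", pmid))
    return "PMID created"


publications, events = {}, []
first = import_pmid("39046104", publications, events)
second = import_pmid("39046104", publications, events)
print(first, "|", second, "|", len(events), "event(s)")
```

Repeated clicks on import for the same PMID then leave the tracking history with a single "PMID created" entry.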

sajo-ebi commented 1 month ago

@sajo-ebi has fixed this in sandbox and will push it to Prod as part of the curation reporting release.