DOAJ / doaj

The Directory of Open Access Journals - website and directory software
Apache License 2.0

Evidence that matching doesn't always work for article uploads #830

Closed: dommitchell closed this issue 7 years ago

dommitchell commented 9 years ago

https://doaj.org/toc/2166-6482/29/4
https://doaj.org/toc/2166-6482/29/3
https://doaj.org/toc/2166-6482/29/2
https://doaj.org/toc/2166-6482/29/1

In #711 (https://github.com/DOAJ/doaj/issues/711#issuecomment-127292406) @richard-jones says: "there's no doubt that the matching algorithm is working correctly - I was able to do a mock match for one of these and found the other." And yet the links above show that there are circumstances in which matching does not work correctly.

In this particular example, the original XML uploads were affected by the recent upload stoppage, so perhaps something there is causing this. Thoughts?

richard-jones commented 7 years ago

Ok, this issue has got too long and complex. It has been replaced by the following 3 issues:

#1294 - a data cleanup task that can be done during that work

#1295 - an enhancement to the data ingest validation to reduce the chances of unintended duplication

#1296 - a script which will allow us to report on the nature of duplication in the system at any point