hrvg / wateReview

Computational literature review of water resources research in Latin America and the Caribbean.
https://hrvg.github.io/wateReview

The number of PDF files does not match the number of rows for each database. #2

Closed hrvg closed 5 years ago

hrvg commented 5 years ago

There is a mismatch between the number of PDFs and the number of entries in the databases. There are probably duplicates on both sides: duplicate entries in the database (e.g. articles with the same title under slight variations of the special characters) and duplicate PDFs.

PDF files and database entries should ultimately be perfectly aligned.
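
To illustrate the database side, here is a minimal sketch of how titles that differ only in special characters could be grouped. It is written in Python for illustration and is not the repository's R code; `normalize_title` and `find_duplicate_titles` are hypothetical helper names, and the normalization rules (strip accents, ignore case and punctuation) are an assumption about what "slight variations" means here.

```python
import unicodedata
from collections import defaultdict


def normalize_title(title: str) -> str:
    """Reduce a title to a comparison key (assumed rules, not the project's)."""
    # NFKD splits accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks so that "México" and "Mexico" compare equal.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Ignore case, and turn punctuation (hyphens, dashes, quotes) into spaces.
    cleaned = "".join(ch if ch.isalnum() else " " for ch in stripped.casefold())
    # Collapse runs of whitespace into single spaces.
    return " ".join(cleaned.split())


def find_duplicate_titles(titles):
    """Map each normalized title to the row indices that share it."""
    groups = defaultdict(list)
    for index, title in enumerate(titles):
        groups[normalize_title(title)].append(index)
    # Keep only the keys that occur on more than one row.
    return {key: rows for key, rows in groups.items() if len(rows) > 1}
```

Rows flagged this way would still need a manual check, since two distinct articles could share a normalized title.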

hrvg commented 5 years ago

The code for counting the papers is in ./R/utils/get_emails.R

hrvg commented 5 years ago

I think that this is linked to the duplicated PDFs. Closing this now.
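
For reference, duplicated PDFs could be detected with something like the sketch below, which groups files by a hash of their contents. This is a Python illustration under stated assumptions, not the code used in this repository: `file_digest` and `find_duplicate_pdfs` are hypothetical names, and it only catches byte-identical copies, so the same article downloaded from two sources may not be flagged.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_pdfs(folder):
    """Return lists of paths under `folder` whose contents are identical."""
    groups = defaultdict(list)
    # rglob walks subfolders; the "*.pdf" pattern is case-sensitive on Linux.
    for path in sorted(Path(folder).rglob("*.pdf")):
        groups[file_digest(path)].append(path)
    # Keep only the digests shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Comparing the number of distinct digests with the number of deduplicated database rows would then show whether any mismatch remains.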