Closed brant-ruan closed 7 months ago
Hi, I cannot reproduce this issue. Can you provide the error notification?
I can import this paper into my library, but the metadata is wrong. The reason is that the paper uses an incorrect DOI: https://doi.org/10.1145/nnnnnnn.nnnnnnn
Thanks for pointing out the issue.
I used advanced search and found another paper that contains https://doi.org/10.1145/nnnnnnn.nnnnnnn. This does not seem to be common, but if published papers never replaced this placeholder DOI, Paperlib will treat them all as the same paper, namely "TestRank: Bringing Order into Unlabeled Test Instances for Deep Learning Tasks" (NeurIPS).
For this special case, I cannot modify the DOI and re-run scraping, because Paperlib will still read the DOI from the paper and fetch information with it.
Currently, please manually edit the metadata of papers with such DOIs.
Tomorrow I have a conference deadline. After that, I will investigate and fix this issue ASAP.
Thanks. Wishing you all the best for your paper's acceptance at the conference :-)
@brant-ruan Hi, this issue has been fixed now.
I implemented an invalid-DOI check in the metadata server. However, we can currently only get the title and author list of this paper. It is a very recent publication, and no database has a record of it yet.
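For readers curious what such a check could look like, here is a minimal sketch in Python. It is not the metadata server's actual code: the function names and the heuristic (a suffix made only of one repeated letter, as in `nnnnnnn.nnnnnnn`) are assumptions made for illustration.

```python
import re

# Hypothetical helpers for illustration; not Paperlib's actual implementation.

# Strip a resolver URL or "doi:" prefix so only the bare DOI remains.
DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:)", re.IGNORECASE)
# Rough DOI shape: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")
# Assumed heuristic: suffix built from a single repeated letter in
# dot-separated groups, e.g. "nnnnnnn.nnnnnnn" or "xxxxxxx".
PLACEHOLDER_SUFFIX = re.compile(r"^([a-z])\1{3,}(?:\.\1{4,})*$", re.IGNORECASE)


def normalize_doi(raw: str) -> str:
    """Return the bare, lower-cased DOI without any URL or 'doi:' prefix."""
    return DOI_PREFIX.sub("", raw.strip()).lower()


def is_unusable_doi(raw: str) -> bool:
    """True if the string is not DOI-shaped or looks like a template placeholder."""
    doi = normalize_doi(raw)
    if not DOI_SHAPE.match(doi):
        return True
    suffix = doi.split("/", 1)[1]
    return bool(PLACEHOLDER_SUFFIX.match(suffix))
```

With a check like this, a scraper could skip the DOI lookup for `https://doi.org/10.1145/nnnnnnn.nnnnnnn` and fall back to matching on title and authors instead.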
For conference papers, it is common to wait at least six months to a year before those databases record them. I usually collect recently accepted papers in my own research field and insert them into the metadata server database manually, but I cannot do that for all research fields.
I'm thinking that it may be a good idea to create a GitHub repo that stores lists of publications and to connect the metadata server to it. Asking users to submit a list of papers via a pull request should be acceptable.
Best wishes.
Agree.
There is a common situation (at least for me): when I search for a paper with a search engine, I get two download sources: 1) the publication database and 2) the author's academic home page or the institution's page. The second one sometimes provides a pre-publication version (or something similar) that is never updated and has no valid DOI. As the contents from both sources are usually identical, and the second source often becomes available earlier than the database, I download from it.
The GitHub repo idea is great. I am very glad to contribute to the information security field.
Hi @brant-ruan, I've created a GitHub repository for the community to contribute to the metadata database.
https://github.com/Future-Scholars/paperlib-community-metadata-collection
Just create a JSON file containing the metadata you want to add and open a PR against this repo. Once the PR is merged, the data in the JSON file will be inserted into our metadata database. You can then scrape the corresponding metadata in Paperlib.
Great! Thanks @GeoffreyChen777, I will check it after the submission deadline :-)
Describe the bug
Occasionally I find that some papers (in PDF format) cannot be imported into Paperlib, even though they can be opened in PDF readers. Sometimes Paperlib reports an invalid-PDF message in the log at the bottom left, but not always.
For example, the paper below cannot be imported (either by downloading it and importing manually or by importing with the chrome plugin):
https://mengrj.github.io/files/CCS23.pdf
To Reproduce
System