winkm89 / teachPress

Official repository of teachPress (publication management plugin for WordPress)
GNU General Public License v2.0

overwrite repeated entries #101

Closed diemort closed 3 years ago

diemort commented 6 years ago

I have a publication list that is going to be updated throughout the year. The easiest way to do that is to regularly upload a new bibtex list, which will include all entries of a given year.

Let's say I have 10 publications from 2018 already listed on my site. I have noticed that if I upload a new bibtex list with one new entry (10 + 1 entries), I get a database containing the new entry plus 10 duplicated entries.

It would be nice to make the plugin check for possible duplicated entries (in this case, 10) and add only the new ones to the database.

winkm89 commented 6 years ago

Hi,

this function is already available but hidden by default. But you can activate it:

[screenshots: overwrite, overwriting2]

diemort commented 6 years ago

Dear Michael,

Thank you for the confirmation. I wasn't sure that option meant to overwrite duplicated entries.

I've tried to upload the same .bib twice, but I still get duplicated entries. So I assume it is really an experimental feature.

Are there plans to improve it? It'd be very convenient for many users.

winkm89 commented 6 years ago

Hi,

Sorry for the late answer. If you could post me a part of your .bib file (one or two entries which were duplicated with the import), then I could try to find the error in the function.

diemort commented 6 years ago

Dear Michael,

Thank you for your availability. Here is a piece of the BIB file with a few publications.

Let me know in case you need more information from my side.

Cheers,

--Gustavo


shahab-ab commented 3 years ago

Hello, thank you for your great plugin. There is still an issue with the overwrite option. I tried to import the same file, and some entries are still duplicated with the same title and authors. I think that instead of the same record being overwritten, a record above or below may be getting replaced. This would be a very useful feature if it worked without bugs.

To be more clear: in my case the first import brings 311 entries, and 2 entries are added with every further import of the exact same file. Thanks for your time and huge effort.

winkm89 commented 3 years ago

Hi, The overwrite option compares only the bibtex key and not a title/author combination. So I think that the duplicates have different bibtex keys?
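A minimal sketch of what key-only matching implies, written in Python rather than taken from teachPress itself. The `import_entries` function and the dictionary layout are hypothetical; the keys and title come from the DBLP entries posted later in this thread. Two records with the same title but different bibtex keys are treated as different publications, while re-importing the same keys does not add rows.

```python
def import_entries(database, incoming, overwrite=True):
    """Merge incoming entries into database, matching on the bibtex key only."""
    for entry in incoming:
        key = entry["key"]
        if key in database and not overwrite:
            continue  # keep the stored record untouched
        database[key] = entry  # insert a new record or replace the stored one
    return database


TITLE = ("Evaluating the Impact of Knowledge Graph Context "
         "on Entity Disambiguation Models")

batch = [
    {"key": "DBLP:conf/cikm/Mulang0PNH020", "type": "inproceedings", "title": TITLE},
    {"key": "DBLP:journals/corr/abs-2008-05190", "type": "article", "title": TITLE},
]

database = {}
import_entries(database, batch)
count_after_first = len(database)
import_entries(database, batch)  # same file again: records are replaced, not added
count_after_second = len(database)

print(count_after_first, count_after_second)
```

Under this assumption, the conference paper and its CoRR preprint both stay in the list because their keys differ, even though title and authors are identical.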

shahab-ab commented 3 years ago

Hi, thank you Michael for answering this issue. Correct, but I imported the same file twice, and the number of publications increased by 2 with every import.

winkm89 commented 3 years ago

Are these always the same entries that are duplicated?

shahab-ab commented 3 years ago

> Are these always the same entries that are duplicated?

I am sure only about the titles. I will check and give you feedback soon.

shahab-ab commented 3 years ago

UPDATE: Here are two examples. The first import brought 312 entries, which is correct. The second run with the same BibTeX file brought 314 entries, an increase of 2. It lists the same entry under a different Pub-Type:

[two screenshots]

winkm89 commented 3 years ago

Could you post the original bibtex code of these two entries?

shahab-ab commented 3 years ago

Although there are 4 entries with the same title, on the first try TP inserts only one of the similar entries. On the second run it inserts the two others into the publications. (The Tags table is also left empty, for all entries.)

@inproceedings{DBLP:conf/cikm/Mulang0PNH020,
  author    = {Isaiah Onando Mulang and Kuldeep Singh and Chaitali Prabhu and Abhishek Nadgeri and Johannes Hoffart and Jens Lehmann},
  editor    = {Mathieu d'Aquin and Stefan Dietze and Claudia Hauff and Edward Curry and Philippe Cudr{\'{e}}{-}Mauroux},
  title     = {Evaluating the Impact of Knowledge Graph Context on Entity Disambiguation Models},
  booktitle = {{CIKM} '20: The 29th {ACM} International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 19-23, 2020},
  pages     = {2157--2160},
  publisher = {{ACM}},
  year      = {2020},
  url       = {https://doi.org/10.1145/3340531.3412159},
  doi       = {10.1145/3340531.3412159},
  timestamp = {Fri, 25 Dec 2020 01:15:14 +0100},
  biburl    = {https://dblp.org/rec/conf/cikm/Mulang0PNH020.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@article{DBLP:journals/corr/abs-2008-05190,
  author        = {Isaiah Onando Mulang and Kuldeep Singh and Chaitali Prabhu and Abhishek Nadgeri and Johannes Hoffart and Jens Lehmann},
  title         = {Evaluating the Impact of Knowledge Graph Context on Entity Disambiguation Models},
  journal       = {CoRR},
  volume        = {abs/2008.05190},
  year          = {2020},
  url           = {https://arxiv.org/abs/2008.05190},
  archivePrefix = {arXiv},
  eprint        = {2008.05190},
  timestamp     = {Sun, 16 Aug 2020 01:00:00 +0200},
  biburl        = {https://dblp.org/rec/journals/corr/abs-2008-05190.bib},
  bibsource     = {dblp computer science bibliography, https://dblp.org}
}

@inproceedings{DBLP:conf/coling/XuNCL20,
  author    = {Chengjin Xu and Mojtaba Nayyeri and Yung{-}Yu Chen and Jens Lehmann},
  editor    = {Donia Scott and N{\'{u}}ria Bel and Chengqing Zong},
  title     = {Knowledge Graph Embeddings in Geometric Algebras},
  booktitle = {Proceedings of the 28th International Conference on Computational Linguistics, {COLING} 2020, Barcelona, Spain (Online), December 8-13, 2020},
  pages     = {530--544},
  publisher = {International Committee on Computational Linguistics},
  year      = {2020},
  url       = {https://doi.org/10.18653/v1/2020.coling-main.46},
  doi       = {10.18653/v1/2020.coling-main.46},
  timestamp = {Fri, 08 Jan 2021 00:00:00 +0100},
  biburl    = {https://dblp.org/rec/conf/coling/XuNCL20.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

@article{DBLP:journals/corr/abs-2010-00989,
  author        = {Chengjin Xu and Mojtaba Nayyeri and Yung{-}Yu Chen and Jens Lehmann},
  title         = {Knowledge Graph Embeddings in Geometric Algebras},
  journal       = {CoRR},
  volume        = {abs/2010.00989},
  year          = {2020},
  url           = {https://arxiv.org/abs/2010.00989},
  archivePrefix = {arXiv},
  eprint        = {2010.00989},
  timestamp     = {Mon, 12 Oct 2020 01:00:00 +0200},
  biburl        = {https://dblp.org/rec/journals/corr/abs-2010-00989.bib},
  bibsource     = {dblp computer science bibliography, https://dblp.org}
}

shahab-ab commented 3 years ago

The complete file: https://dblp.org/pid/71/4882.bib?param=1

winkm89 commented 3 years ago

Thank you for the examples. I think I have it. It's a small nasty bug in the method _TP_Publications::generate_unique_bibtexkey(). This method tests whether the bibtex key already exists before the publication is imported.

The problem was that the check was too weak. If you have, for example, a publication with the bibtex key "123" and another with "123a", then the method checks for a key like "%123%", which also matches "123a".

I've released teachPress 7.1.5 on wordpress.org, which contains the bugfix.
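The difference between the loose check described above and an exact comparison can be illustrated with a small sketch. It uses Python with an in-memory SQLite database as a stand-in for the WordPress database; the table name `pubs`, the column name `bibtex`, and both helper functions are assumptions made for the illustration, and the exact comparison is only one plausible way to tighten the check, not necessarily what 7.1.5 does.

```python
import sqlite3

# Stand-in database holding a single publication with the key "123a".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pubs (bibtex TEXT)")
conn.execute("INSERT INTO pubs (bibtex) VALUES (?)", ("123a",))


def key_exists_loose(conn, key):
    """Substring check: reports a hit for any stored key that contains `key`."""
    row = conn.execute(
        "SELECT COUNT(*) FROM pubs WHERE bibtex LIKE ?", (f"%{key}%",)
    ).fetchone()
    return row[0] > 0


def key_exists_exact(conn, key):
    """Exact check: reports a hit only when the stored key equals `key`."""
    row = conn.execute(
        "SELECT COUNT(*) FROM pubs WHERE bibtex = ?", (key,)
    ).fetchone()
    return row[0] > 0


loose_hit = key_exists_loose(conn, "123")  # True: "%123%" matches "123a"
exact_hit = key_exists_exact(conn, "123")  # False: no stored key equals "123"
print(loose_hit, exact_hit)
```

With the substring pattern, the key "123" is reported as already present although only "123a" is stored, which is the false positive described above.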

shahab-ab commented 3 years ago

Thank you Michael for your availability and very helpful guides on every issue.

Awesome! Then the issue is now solved. To the UPDATE...

Cheers,

shahab-ab commented 3 years ago

Hi Michael,

Problem description: Since none of the entries from the DBLP BibTeX file contain tags (keywords), I have to add them manually for over 300 items. The problem is that with every update, any customizations of an entry, such as a custom PDF link or added tags, are reset and I have to reapply them all. How can this be prevented?

Regards

winkm89 commented 3 years ago

You can currently only comment this function out in the source code. I think I will add an import option for that.
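What such an import option might do can be sketched as follows. This is purely illustrative Python, not teachPress code: the function `overwrite_entry`, its `preserve` parameter, and the field names `tags` and `pdf_url` are all hypothetical. The idea is that an overwrite takes the bibliographic fields from the imported entry but keeps manually maintained fields from the stored record when they are set.

```python
def overwrite_entry(stored, incoming, preserve=("tags", "pdf_url")):
    """Return the imported entry, keeping non-empty preserved fields from the stored one."""
    merged = dict(incoming)
    for field in preserve:
        if stored.get(field):  # only keep values the user actually set
            merged[field] = stored[field]
    return merged


stored = {
    "key": "DBLP:conf/coling/XuNCL20",
    "title": "Knowledge Graph Embeddings in Geometric Algebras",
    "tags": ["knowledge graphs"],                # added by hand after the first import
    "pdf_url": "https://example.org/paper.pdf",  # placeholder custom link
}
incoming = {
    "key": "DBLP:conf/coling/XuNCL20",
    "title": "Knowledge Graph Embeddings in Geometric Algebras",
    "pages": "530--544",
    "tags": [],       # the DBLP export carries no keywords
    "pdf_url": "",
}

merged = overwrite_entry(stored, incoming)
print(merged["tags"], merged["pdf_url"], merged["pages"])
```

Passing an empty `preserve` tuple would reproduce the current behaviour described in this thread, where an update resets the customizations.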