FlamingTempura / bibtex-tidy

Cleaner and Formatter for BibTeX files
https://flamingtempura.github.io/bibtex-tidy/
MIT License
859 stars, 66 forks

Duplicates missed when formatting is different #424

Closed: am-thyst closed this issue 6 months ago

am-thyst commented 8 months ago

I have taken two duplicate entries from my bibliography and anonymized them.

They clearly refer to the same article, but the "Check for duplicates: matching keys, matching DOIs, similar author and title" option does not flag them. I assume the different author formatting is throwing it off, even though the entries share the same title, year, volume, journal, etc.

Here's the reproducible example:

@article{Doe2014,
    title        = {{Example Article to Demonstrate Bug}},
    author       = {J. A. Doe and A. B. C. Johnson and D. E. Martin},
    year         = 2014,
    journal      = {Example Review Journal},
    publisher    = {American Example Society},
    address      = {Boston MA, USA},
    volume       = 142,
    number       = 4,
    pages        = {1234--1256},
    doi          = {https://doi.org/11.1111/EXA-M-11-11111.1},
    url          = {https://journals.example.org/view/journals/example/123/1/example-100.1.xml}
}

@article{doe2014example,
    title        = {{Example article to demonstrate bug}},
    author       = {Doe, JA and Johnson, ABC and Martin, DE},
    year         = 2014,
    journal      = {Example Review Journal},
    publisher    = {American Example Society},
    volume       = 142,
    number       = 4,
    pages        = {1234--1256},
}

As you would expect, the duplicate is detected once a DOI is added to the second entry.
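For illustration, a DOI match of this kind can be sketched as below. This is only a minimal sketch, not bibtex-tidy's actual code: the helper name `normalize_doi` is hypothetical, and stripping the `https://doi.org/` prefix and ignoring case are assumptions about how two DOI fields might be compared.

```python
import re


def normalize_doi(doi: str) -> str:
    """Hypothetical helper: reduce a DOI field to a comparable form.

    Assumption: the resolver prefix is dropped and case is ignored
    (DOIs are case-insensitive).
    """
    doi = doi.strip().lower()
    return re.sub(r"^(https?://)?(dx\.)?doi\.org/", "", doi)


# The first entry stores the DOI as a full URL; a bare DOI added to the
# second entry would compare equal after normalization.
with_prefix = normalize_doi("https://doi.org/11.1111/EXA-M-11-11111.1")
bare = normalize_doi("11.1111/EXA-M-11-11111.1")
print(with_prefix == bare)
```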

FlamingTempura commented 6 months ago

This should be fixed in the latest version (1.13.0).

The similarity check uses the last name of the first author, the title, and the number. The code to get the last name has been improved.
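A rough sketch of such a check, applied to the two entries from the report, might look like the following. It is not the bibtex-tidy implementation: the function names are hypothetical, "the number" is assumed to mean the `number` field, and the last-name extraction only handles the "First Last" and "Last, First" forms (particles such as "von" are not handled).

```python
import re


def first_author_last_name(author_field: str) -> str:
    """Return the lower-cased last name of the first author (hypothetical helper)."""
    first = re.split(r"\s+and\s+", author_field.strip())[0]
    if "," in first:
        # "Last, First" form: the last name precedes the comma.
        last = first.split(",")[0]
    else:
        # "First Last" form: take the final whitespace-separated token.
        last = first.split()[-1]
    return last.strip().lower()


def normalize_title(title: str) -> str:
    """Lower-case the title and keep only ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", title.lower())


def similarity_key(entry: dict) -> tuple:
    """Build a comparison key from first-author last name, title, and number."""
    return (
        first_author_last_name(entry["author"]),
        normalize_title(entry["title"]),
        str(entry.get("number", "")),  # assumption: the BibTeX `number` field
    )


a = {
    "title": "{Example Article to Demonstrate Bug}",
    "author": "J. A. Doe and A. B. C. Johnson and D. E. Martin",
    "number": 4,
}
b = {
    "title": "{Example article to demonstrate bug}",
    "author": "Doe, JA and Johnson, ABC and Martin, DE",
    "number": 4,
}
print(similarity_key(a) == similarity_key(b))
```

Under these assumptions both entries reduce to the same key, because the differences in author formatting and title capitalization are removed before comparison.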

Tested with your example and it will now merge into a single entry.