pcubillos / bibmanager

A BibTeX manager for LaTeX projects
https://bibmanager.rtfd.io
MIT License

Bug: Regression from #90 with False duplicates detected for Incollection book entries #108

Closed by emirkmo 2 years ago

emirkmo commented 2 years ago

Issue #89, which dealt with false duplicate detections, was resolved by PR #90 and merged into version 1.3.4. I am running version 1.4.5, installed via conda-forge, and I am hitting the same issue again.

Here is a false duplicate detection: the ISBNs are the same, but the DOIs are different.

The expected behavior is that bibmanager does not complain when merging these two entries, or other similar entries.

Note: This is possibly a DOI parsing issue, as these DOIs contain LaTeX escape sequences.
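One way to rule the DOI-parsing hypothesis in or out is to normalize the escaped underscores before comparing. The sketch below is only an illustration: normalize_doi is a hypothetical helper, not a bibmanager function, and the regular expression assumes the only escaping involved is around underscores, as in the two DOIs from this report.

```python
import re


def normalize_doi(doi):
    """Hypothetical helper (not part of bibmanager): undo LaTeX escaping
    of underscores so that DOIs can be compared as plain strings."""
    # Collapse "{\_}", "{\\_}", "\_", and "\\_" into a plain underscore.
    return re.sub(r"\{?\\+_\}?", "_", doi).strip().lower()


# The two DOI values from this report, written as raw strings.
old_doi = normalize_doi(r"10.1007/978-3-319-20794-0{\\_}129-1")
new_doi = normalize_doi(r"10.1007/978-3-319-21846-5{\_}1")

# The DOIs remain distinct after normalization.
dois_differ = old_doi != new_doi
```

If the DOIs still differ after this kind of cleanup, the escape sequences alone would not explain the false duplicate.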

@incollection{OConnor2017,
    author = "O’Connor, Evan",
    editor = "Alsabti, A. W. and Murdin, P",
    title = "{The Core-Collapse Supernova-Black Hole Connection}",
    year = "2016",
    booktitle = "Handbook of Supernovae",
    pages = "1--18",
    publisher = "Springer",
    url = "https://doi.org/10.1007/978-3-319-20794-0\_129-1 http://link.springer.com/10.1007/978-3-319-20794-0\_129-1",
    address = "Cham",
    isbn = "9783319218465",
    doi = "10.1007/978-3-319-20794-0{\\_}129-1"
}

NEW:
@incollection{Alsabti2016,
    title = {{Supernovae and Supernova Remnants: The Big Picture in Low Resolution}},
    year = {2017},
    booktitle = {Handbook of Supernovae},
    author = {Alsabti, Athem W. and Murdin, Paul},
    editor = {Alsabti, A.~W. and Murdin, P},
    pages = {3--28},
    publisher = {Springer, Cham},
    isbn = {9783319218465},
    doi = {10.1007/978-3-319-21846-5{\_}1},
    keywords = {Physics}
}

emirkmo commented 2 years ago

OK, I found where it happens: bibm merge does not filter out ISBN duplicates that come from different chapters of the same book.

https://github.com/pcubillos/bibmanager/blob/eaf1d351fb3f4de79c14bfd84c2223d74220e15c/bibmanager/bib_manager/bib_manager.py#L943

This line calls filter_field, which lacks the check for duplicate ISBNs with different DOIs that the remove_duplicates function in the same module has. That logic is at:

https://github.com/pcubillos/bibmanager/blob/eaf1d351fb3f4de79c14bfd84c2223d74220e15c/bibmanager/bib_manager/bib_manager.py#L555-L568

I suggest factoring out the logic that checks for ISBN duplicates that don't share a DOI (as with chapters of a book), and then calling it in filter_field when the field argument is 'isbn'.

The filter_field function to modify: https://github.com/pcubillos/bibmanager/blob/eaf1d351fb3f4de79c14bfd84c2223d74220e15c/bibmanager/bib_manager/bib_manager.py#L587-L628
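The suggestion above could look roughly like the following. This is a standalone sketch under stated assumptions, not bibmanager's actual code: entries are modeled as plain dicts, and same_isbn_different_doi and find_duplicates are hypothetical names standing in for the shared check and for the duplicate search that filter_field and remove_duplicates would both rely on.

```python
from collections import defaultdict
from itertools import combinations


def same_isbn_different_doi(entry_a, entry_b):
    """Hypothetical shared check: True when two entries have the same ISBN
    but distinct, non-empty DOIs (e.g. two chapters of one book)."""
    isbn_a, isbn_b = entry_a.get("isbn"), entry_b.get("isbn")
    if not isbn_a or isbn_a != isbn_b:
        return False
    doi_a, doi_b = entry_a.get("doi"), entry_b.get("doi")
    return bool(doi_a) and bool(doi_b) and doi_a != doi_b


def find_duplicates(entries, field):
    """Return pairs of keys whose entries share the same value of `field`,
    skipping same-book chapters when the field is 'isbn'."""
    groups = defaultdict(list)
    for entry in entries:
        value = entry.get(field)
        if value:
            groups[value].append(entry)

    duplicates = []
    for group in groups.values():
        for a, b in combinations(group, 2):
            if field == "isbn" and same_isbn_different_doi(a, b):
                continue  # same book, different chapters: not a duplicate
            duplicates.append((a["key"], b["key"]))
    return duplicates


# The two entries from this report, reduced to the relevant fields.
entries = [
    {"key": "OConnor2017", "isbn": "9783319218465",
     "doi": "10.1007/978-3-319-20794-0_129-1"},
    {"key": "Alsabti2016", "isbn": "9783319218465",
     "doi": "10.1007/978-3-319-21846-5_1"},
]
isbn_duplicates = find_duplicates(entries, "isbn")
```

With the check in one place, both the merge path and the duplicate-removal path would treat same-ISBN, different-DOI entries the same way. An entry that shares the ISBN but has no DOI is still reported in this sketch, since there is nothing to tell it apart from the others.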

emirkmo commented 2 years ago

We should probably also add a bibm merge case to the tests that contains a duplicate ISBN with separate DOIs :)

pcubillos commented 2 years ago

Hi Emir, thanks a lot for the detailed report! Version 1.4.6 should fix this issue (pip is updated, conda should take a few hours to get up to date). Please take a look when you have time and let me know whether things are running ok.