Open koppor opened 1 year ago
WARN: abbreviation is the same as the full text
When a journal name is only one word, its abbreviation is the same as the full name.
e.g. full name: Fuel, its abbrev is Fuel.
Hi, I would like to tackle this issue with my group : )
@northword I think the expected result is a Python tool residing in https://github.com/JabRef/abbrv.jabref.org/tree/main/scripts. It should print out issues and exit with a failure code if issues are found. -- You can choose another programming language if you want.
Example output of lychee, which has another purpose, but also outputs check results:
(Source: https://github.com/JabRef/jabref/actions/runs/11361716475)
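A minimal sketch of the "print issues, exit with failure code" behaviour described above. It is not the repository's actual script: the `(full, abbrev)` tuple shape and the names `run_checks`, `report`, and `outdated_manage` are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Tuple

Entry = Tuple[str, str]  # (full name, abbreviation); assumed in-memory shape
Check = Callable[[List[Entry]], List[str]]


def outdated_manage(rows: List[Entry]) -> List[str]:
    """Example check: flag abbreviations that use "Manage." as a token."""
    return [
        f'WARN: Management is abbreviated with outdated "Manage." '
        f'instead of "Manag.": {abbrev!r}'
        for _, abbrev in rows
        if "Manage." in abbrev.split()
    ]


def run_checks(entries: Iterable[Entry], checks: Iterable[Check]) -> List[str]:
    """Run every check over the same rows and collect all messages."""
    rows = list(entries)
    issues: List[str] = []
    for check in checks:
        issues.extend(check(rows))
    return issues


def report(issues: List[str]) -> int:
    """Print each issue and return a process exit code (non-zero on issues)."""
    for issue in issues:
        print(issue)
    return 1 if issues else 0
```

In a script, passing the result to `sys.exit(report(issues))` would make a CI job fail whenever any issue is printed.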
Hey, when implementing the check logic for 'WARN: abbreviation is the same as the full text,' should we only give a warning if the journal's name has more than one word and the abbreviation is the same as its full name? If the journal name is just one word, as @northword mentioned, should we simply pass it?
Yes.
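The rule agreed on here (warn only when the full name has more than one word and the abbreviation equals it) could be sketched like this. The function name and the `(full, abbrev)` tuple input are assumptions, not existing code from the repository.

```python
from typing import Iterable, List, Tuple


def same_as_full_warnings(entries: Iterable[Tuple[str, str]]) -> List[str]:
    """Warn when a multi-word journal name is 'abbreviated' to itself."""
    warnings: List[str] = []
    for full, abbrev in entries:
        full, abbrev = full.strip(), abbrev.strip()
        is_multi_word = len(full.split()) > 1
        # One-word names such as "Fuel" pass, since they have no shorter form.
        if is_multi_word and full == abbrev:
            warnings.append(
                f"WARN: abbreviation is the same as the full text: {full!r}"
            )
    return warnings
```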
My current function that checks the validity of starting letters of abbreviations considers the below entries as invalid, because the starting letters of the abbreviations do not match well with the full names.
Full: 'Polish Academy of Sciences', Abbrev: 'Acta Phys. Polon. A'
Full: 'Jagellonian University', Abbrev: 'Acta Phys. Polon. B'
Full: 'Universităţii din Timișoara', Abbrev: 'An. Univ. Timișoara Ser. Mat.-Inform.'
Full: 'Universităţii "Ovidius" Constanţa', Abbrev: 'An. Ştiinţ. Univ. Ovidius Constanţa Ser. Mat.'
However, these abbreviations seem to be legitimate for the corresponding full names, even though the match is not obvious. Could you give me some ideas on how to refine the invalidity criteria?
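For readers following along, here is one simple way a beginning-letters check might work. This is not the function described in the comment above, which was not posted. It is a hypothetical sketch that requires the first letter of each abbreviation token to match the first letter of some word in the full name, in order.

```python
def beginning_letters_match(full: str, abbrev: str) -> bool:
    """Return True if the abbreviation tokens' initials appear, in order,
    among the initials of the words in the full name (case-insensitive)."""
    full_initials = [word[0].lower() for word in full.split()]
    pos = 0  # next position in full_initials that may still be matched
    for token in abbrev.split():
        initial = token[0].lower()
        try:
            pos = full_initials.index(initial, pos) + 1
        except ValueError:
            return False  # no remaining word starts with this letter
    return True
```

With this rule, 'Journal of Applied Physics' / 'J. Appl. Phys.' passes, while 'Polish Academy of Sciences' / 'Acta Phys. Polon. A' is rejected, which is the kind of false positive discussed here.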
Maybe a hard coded list of exceptions? 😅
Not sure how many there are to be hardcoded : ( I might try using some similarity threshold to check them. That way abbreviations that are legitimate but are too different from the original full names would fail the check. Does that work?
I haven't tried.
Maybe test cases need to be generated.
Maybe warnings can be output. Then an exception file generated by the user. Similar to .lycheeignore for the link checker lychee.
One might also output a number stating the distance.
For manual lists, this is helpful.
For downloaded lists, reports could be made.
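The ideas in this comment (a similarity threshold, a printed distance, and a user-maintained exception list similar to .lycheeignore) could be combined roughly as follows. This sketch uses `difflib.SequenceMatcher` from the Python standard library as one possible similarity measure. The threshold of 0.4, the function names, and the in-memory `ignored` set are arbitrary assumptions that would need tuning against real lists.

```python
from difflib import SequenceMatcher
from typing import AbstractSet, Iterable, List, Tuple

Entry = Tuple[str, str]  # (full name, abbreviation); assumed shape


def similarity(full: str, abbrev: str) -> float:
    """Case-insensitive similarity in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, full.lower(), abbrev.lower()).ratio()


def dissimilar_warnings(
    entries: Iterable[Entry],
    threshold: float = 0.4,  # arbitrary starting value, not a tested cutoff
    ignored: AbstractSet[Entry] = frozenset(),
) -> List[str]:
    """Warn about pairs whose similarity is below the threshold,
    skipping pairs the user has explicitly listed as exceptions."""
    warnings: List[str] = []
    for full, abbrev in entries:
        if (full, abbrev) in ignored:
            continue
        score = similarity(full, abbrev)
        if score < threshold:
            warnings.append(
                f"WARN: abbreviation differs strongly from full name "
                f"(similarity {score:.2f}): {full!r} -> {abbrev!r}"
            )
    return warnings
```

The `ignored` set could be filled from an exception file kept next to the lists, so that legitimate but dissimilar pairs are reviewed once and then silenced.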
I think there are bugs in the lists.
I needed to fix lists, because "wrong" lists had been added. See https://github.com/JabRef/abbrv.jabref.org/pull/148
We should have a checker. It should check for the following:
ERROR: Wrong escape
ERROR: Wrong beginning letters
(This is https://github.com/JabRef/abbrv.jabref.org/issues/107)
ERROR: List contains non-UTF8 characters
This is https://github.com/JabRef/abbrv.jabref.org/issues/125.
WARN: Double entries
(This refs https://github.com/JabRef/abbrv.jabref.org/issues/77)
WARN: Same full form appearing twice
(This refs https://github.com/JabRef/abbrv.jabref.org/issues/77)
WARN: Same abbreviation appearing twice
(This refs https://github.com/JabRef/abbrv.jabref.org/issues/77)
WARN: abbreviation is the same as the full text
WARN: Management is abbreviated with outdated "Manage." instead of "Manag."
This is https://github.com/JabRef/abbrv.jabref.org/issues/78
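Two of the items above, the non-UTF8 error and the three duplicate warnings, can be sketched with the standard library alone. The function names and the `(full, abbrev)` tuple input are assumptions. How the rows are read from the list files is left out on purpose.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Entry = Tuple[str, str]  # (full name, abbreviation); assumed shape


def non_utf8_errors(raw: bytes) -> List[str]:
    """Report the first byte sequence that is not valid UTF-8, if any."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [
            f"ERROR: List contains non-UTF8 characters (byte offset {exc.start})"
        ]
    return []


def duplicate_warnings(entries: Iterable[Entry]) -> List[str]:
    """Warn about repeated rows, repeated full names, repeated abbreviations."""
    rows = list(entries)
    warnings: List[str] = []
    for (full, abbrev), count in Counter(rows).items():
        if count > 1:
            warnings.append(f"WARN: Double entries: {full!r} -> {abbrev!r}")
    # Count over distinct rows so an exact duplicate is not reported twice.
    distinct = list(dict.fromkeys(rows))
    for full, count in Counter(full for full, _ in distinct).items():
        if count > 1:
            warnings.append(f"WARN: Same full form appearing twice: {full!r}")
    for abbrev, count in Counter(abbrev for _, abbrev in distinct).items():
        if count > 1:
            warnings.append(f"WARN: Same abbreviation appearing twice: {abbrev!r}")
    return warnings
```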