koppor / jabref

Collection of simple issues for JabRef. Please submit PRs to https://github.com/jabRef/jabref/.
MIT License

Make abbreviations also working for conferences #360

Open koppor opened 5 years ago

koppor commented 5 years ago

Abbreviations are listed at https://github.com/JabRef/abbrv.jabref.org. The system is currently not used for conference proceedings (field booktitle).

Support abbreviations for "booktitle" as well (data model, GUI). This is https://github.com/JabRef/jabref/issues/1285.

koppor commented 3 years ago

Example based on: https://dblp.uni-trier.de/rec/conf/bpm/Rinderle-MaM21.html?view=bibtex&param=1 (condensed, standard)

Before shortening:

@inproceedings{DBLP:conf/bpm/Rinderle-MaM21,
  author    = {Stefanie Rinderle{-}Ma and
               Juergen Mangler},
  title     = {Process Automation and Process Mining in Manufacturing},
  booktitle = {International Conference on Business Process Management},
  series    = {Lecture Notes in Computer Science},
  volume    = {12875},
  pages     = {3--14},
  publisher = {Springer},
  year      = {2021}
}

After shortening:

@inproceedings{DBLP:conf/bpm/Rinderle-MaM21,
  author    = {Stefanie Rinderle{-}Ma and
               Juergen Mangler},
  title     = {Process Automation and Process Mining in Manufacturing},
  booktitle = {BPM},
  series    = {Lecture Notes in Computer Science},
  volume    = {12875},
  pages     = {3--14},
  publisher = {Springer},
  year      = {2021}
}

There will be a CSV file:

International Conference on Business Process Management,BPM

TODOs:

xianghao-wang commented 2 years ago

Hello, could you assign this issue to me? I am really glad to contribute to this project.

xianghao-wang commented 2 years ago

Hello, @koppor would you like to combine the journal abbreviations and book title abbreviations as a single functional part (share the same abbreviation data) or develop the book title abbreviation as an independent functional part (independent database and independent module)?

koppor commented 2 years ago

Hello, @koppor would you like to combine the journal abbreviations and book title abbreviations as a single functional part (share the same abbreviation data)

The abbreviation data should be different. Instead of the "journals" subdirectory at https://github.com/JabRef/abbrv.jabref.org, the abbreviations should go into "conferences". Otherwise, the lists won't be maintainable IMHO.

develop the book title abbreviation as an independent functional part (independent database and independent module)?

I think you can reuse the journal code. Some renaming and generalization of the methods is necessary. I refined the list at https://github.com/koppor/jabref/issues/360#issuecomment-925277746 to focus on the core functionality. I also added a hint.

image

Siedlerchr commented 1 year ago

Booktitles are checked by the Integrity Check action: https://github.com/JabRef/jabref/blob/8536216e1c2bd0b05f3499192e2b75a88e3fc25e/src/main/java/org/jabref/logic/integrity/AbbreviationChecker.java#L14-L17

koppor commented 1 year ago

Currently, that checker uses the Journal Abbreviation Repository. I ask for conference names (and abbreviations), which are different from journal names.

koppor commented 1 year ago

CSV file format

The CSV file will have the format <full name>,<abbreviation>[,<shortest unique abbreviation>]. Thus, it is a subset of https://github.com/JabRef/abbrv.jabref.org/#format-of-the-csv-files (no frequency shown). Currently, JabRef does not handle "frequency". See /src/main/java/org/jabref/logic/journals/Abbreviation.java#L29. Thus, the class org.jabref.logic.journals.Abbreviation can be re-used as is.
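A minimal sketch of how one line in this format could be parsed. `AbbreviationEntry` and `ConferenceCsvSketch` are hypothetical names used only for illustration; they are not JabRef classes. The sketch assumes that fields contain no commas or quotes, and that a missing shortest unique abbreviation falls back to the regular abbreviation.

```java
import java.util.Optional;

// Hypothetical, simplified stand-in for org.jabref.logic.journals.Abbreviation.
final class AbbreviationEntry {
    final String fullName;
    final String abbreviation;
    final String shortestUniqueAbbreviation;

    AbbreviationEntry(String fullName, String abbreviation, String shortestUniqueAbbreviation) {
        this.fullName = fullName;
        this.abbreviation = abbreviation;
        this.shortestUniqueAbbreviation = shortestUniqueAbbreviation;
    }
}

final class ConferenceCsvSketch {

    // Parses "<full name>,<abbreviation>[,<shortest unique abbreviation>]".
    // Assumption: no quoted fields and no commas inside names; a real CSV parser is needed otherwise.
    static Optional<AbbreviationEntry> parseLine(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length < 2 || fields.length > 3) {
            return Optional.empty(); // wrong number of columns for this subset of the format
        }
        String fullName = fields[0].trim();
        String abbreviation = fields[1].trim();
        if (fullName.isEmpty() || abbreviation.isEmpty()) {
            return Optional.empty();
        }
        String shortest = fields.length == 3 ? fields[2].trim() : "";
        if (shortest.isEmpty()) {
            shortest = abbreviation; // assumption: fall back to the regular abbreviation
        }
        return Optional.of(new AbbreviationEntry(fullName, abbreviation, shortest));
    }
}
```

For the BPM line from the example CSV, `parseLine` yields an entry whose abbreviation and shortest unique abbreviation are both `BPM`.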

Preferences

In "CSV file format", we learned that an Abbreviation object is the same for a conference and a journal. Consequently, List<Abbreviation> is also the same. This leads to the conclusion that JournalAbbreviationPreferences and ConferenceAbbreviationPreferences have a list of abbreviations in common. In object-oriented programming, such shared members are typically modeled with inheritance. Thus: Introduce a new abstract AbbreviationPreferences and a new ConferenceAbbreviationPreferences (inheriting from AbbreviationPreferences). The JournalAbbreviationPreferences also needs to inherit from AbbreviationPreferences.

Where to put the externalLists? In the AbbreviationPreferences, because both the conference abbreviations and the journal abbreviations make use of the list.

Where to put useFJournalField? This is very specific to journals. Thus, it has to stay in org.jabref.logic.journals.JournalAbbreviationPreferences.

classDiagram
    AbbreviationPreferences <|-- JournalAbbreviationPreferences
    AbbreviationPreferences <|-- ConferenceAbbreviationPreferences
    <<Abstract>> AbbreviationPreferences

    AbbreviationPreferences : -externalLists
    JournalAbbreviationPreferences : -useFJournalField
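
The diagram could translate into Java roughly as follows. This is only a sketch: the field types (`List<String>` for externalLists, `boolean` for useFJournalField), the constructors, and the accessor names are assumptions, and the real JabRef preference classes carry more than is shown here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Shared base: holds what journals and conferences have in common.
abstract class AbbreviationPreferences {
    private final List<String> externalLists; // assumed type: paths of external CSV lists

    protected AbbreviationPreferences(List<String> externalLists) {
        this.externalLists = new ArrayList<>(externalLists); // defensive copy
    }

    List<String> getExternalLists() {
        return Collections.unmodifiableList(externalLists);
    }
}

// Journal-specific settings stay in the journal subclass.
final class JournalAbbreviationPreferences extends AbbreviationPreferences {
    private final boolean useFJournalField;

    JournalAbbreviationPreferences(List<String> externalLists, boolean useFJournalField) {
        super(externalLists);
        this.useFJournalField = useFJournalField;
    }

    boolean shouldUseFJournalField() {
        return useFJournalField;
    }
}

// Conferences need nothing beyond the shared list in this sketch.
final class ConferenceAbbreviationPreferences extends AbbreviationPreferences {
    ConferenceAbbreviationPreferences(List<String> externalLists) {
        super(externalLists);
    }
}
```

Code that only needs the external lists can then accept the abstract `AbbreviationPreferences` type and work for both kinds.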

Action points:

Repository

The org.jabref.logic.journals.JournalAbbreviationRepository contains no journal specifics (reason: see "CSV file format" above).

Action: Refactor: Rename JournalAbbreviationRepository to AbbreviationRepository.

Loader

The org.jabref.logic.journals.JournalAbbreviationLoader contains some journal specifics.

The journal specifics are journal-list.mv and /journals/journal-list.mv.

Two options: A) parameterize the class or B) hide these internals in a class hierarchy. I go for B)

classDiagram
    AbbreviationLoader<|-- JournalAbbreviationLoader
    AbbreviationLoader<|-- ConferenceAbbreviationLoader
    <<Abstract>> AbbreviationLoader

    AbbreviationLoader: +readListFromFile
    AbbreviationLoader: AbbreviationLoader(String mvName)
    AbbreviationLoader: +AbbreviationRepository loadRepository(AbbreviationPreferences)
    AbbreviationLoader: -mvName

Using plain AbbreviationPreferences suffices here because fjournal is not needed.

Using the variable mvName, the variable tempJournalList and the path passed to JournalAbbreviationRepository.class.getResourceAsStream("/journals/journal-list.mv") can be constructed dynamically. The tempDir can be named "jabref-abbreviation-loading" (instead of "jabref-journal").
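
A rough sketch of option B, limited to how the names could be derived; reading the .mv file itself is deliberately left out. Several things here are assumptions: the extra `resourceDir` constructor parameter (the diagram only has `mvName`), the helper method names, and the conference values "conferences" and "conference-list.mv".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

abstract class AbbreviationLoader {
    private final String resourceDir; // assumption: not in the diagram, added to build the resource path
    private final String mvName;

    protected AbbreviationLoader(String resourceDir, String mvName) {
        this.resourceDir = resourceDir;
        this.mvName = mvName;
    }

    // Classpath location of the bundled list, e.g. "/journals/journal-list.mv".
    String resourcePath() {
        return "/" + resourceDir + "/" + mvName;
    }

    // Target path for the extracted list inside a fresh temporary directory.
    // Only the directory is created; copying the resource there is omitted.
    Path tempListPath() throws IOException {
        Path tempDir = Files.createTempDirectory("jabref-abbreviation-loading");
        return tempDir.resolve(mvName);
    }
}

final class JournalAbbreviationLoader extends AbbreviationLoader {
    JournalAbbreviationLoader() {
        super("journals", "journal-list.mv");
    }
}

final class ConferenceAbbreviationLoader extends AbbreviationLoader {
    ConferenceAbbreviationLoader() {
        super("conferences", "conference-list.mv"); // hypothetical names
    }
}
```

With this shape, the subclasses only supply names, and everything that touches files stays in the abstract base class.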

Constructors

HoussemNasri commented 1 year ago

One problem is that booktitle usually contains the name of the proceedings, which if I understand correctly, is the name of a publication (usually a book) that gets published after the conference and contains all papers presented. So we can't just read the booktitle value and try to look up its abbreviation. We either have to keep track of the proceedings abbreviations or use conference abbreviations and try to inject the abbreviated conference name into the proceedings name.

I'm not sure if injecting conference names is gonna work, as it's not clear that we can extract conference names from proceedings. It seems that there is this pattern that repeats, "Proceedings of (conference name)", but I don't know if it's always the case.

koppor commented 1 year ago

One problem is that booktitle usually contains the name of the proceedings, which if I understand correctly, is the name of a publication (usually a book) that gets published after the conference and contains all papers presented.

Yes. Example:


"International Conference on Business Process Management" - https://bpm-conference.org/

Proceedings: https://link.springer.com/conference/bpm

Example paper: https://link.springer.com/chapter/10.1007/978-3-031-41620-0_1

BibTeX from springer:

@InProceedings{10.1007/978-3-031-41620-0_1,
author="Christfort, Axel Kjeld Fjelrad
and Slaats, Tijs",
editor="Di Francescomarino, Chiara
and Burattin, Andrea
and Janiesch, Christian
and Sadiq, Shazia",
title="Efficient Optimal Alignment Between Dynamic Condition Response Graphs and Traces",
booktitle="Business Process Management",
year="2023",
publisher="Springer Nature Switzerland",
address="Cham",
pages="3--19",
abstract="Dynamic Condition Response (DCR) Graphs is a popular declarative process modelling notation which is supported by commercial modelling tools and has seen significant industrial adoption. The problem of aligning traces with DCR Graphs, with it's multitude of applications such as conformance checking and log repair, has surprisingly not been solved yet. In this paper we address this open gap in the research by developing an algorithm for efficiently computing the optimal alignment of a DCR Graph and a trace. We evaluate the algorithm on the PDC 2022 dataset, showing that even for large models and traces alignment problems can be solved within milliseconds, and present a case study based on test-driven modelling.",
isbn="978-3-031-41620-0"
}

Which is "good enough"

BibTeX from the DOI:

@InCollection{Christfort2023,
  author    = {Axel Kjeld Fjelrad Christfort and Tijs Slaats},
  booktitle = {Lecture Notes in Computer Science},
  publisher = {Springer Nature Switzerland},
  title     = {Efficient Optimal Alignment Between Dynamic Condition Response Graphs and~Traces},
  year      = {2023},
  pages     = {3--19},
  doi       = {10.1007/978-3-031-41620-0_1},
}

authors wrong, booktitle wrong.

Let's work with the springer-provided one :)


One more example:

IEEE INDIN - https://2023.ieee-indin.org/

IEEE International Conference on Industrial Informatics, INDIN’23

Information from IEEE:

@InProceedings{Koenig2023a,
  author          = {Simone König and Birgit Vogel-Heuser and Jan Wilch and Tobias Unger and Michael Hahn and Stjepan Soldo and Oliver Kopp},
  title           = {BPMN4CARS: A Car-Tailored Workflow Engine},
  year            = {2023},
  address         = {Lemgo, Germany},
  pages           = {1--6},
  publisher       = {IEEE},
  abstract        = {The importance of high-performance computers in car networks increases to realize software assistance functions. These computers offer new opportunities for use cases in car testing and diagnostics towards vehicle software and services. Therefore, car testing and diagnostics are no longer limited to the testing and functional checking of electric and electronic components. Proven concepts from computer science, such as business process management, can be integrated into the car network to provide a uniform basis for automotive use cases. In this paper, (i) the Business Process Model and Notation (BPMN) standard is used to model diagnostic processes in future car production lines and (ii) a car-tailored BPMN workflow engine is introduced to execute the processes in a high-performance vehicular computer. The research on executable BPMN models in car networks bridges the gap between interdisciplinary business processes and execution in automotive IT systems.},
  date            = {18-20 July 2023},
  doi             = {10.1109/INDIN51400.2023.10218082},
  eventdate       = {18-20 July 2023},
  eventtitleaddon = {Lemgo, Germany},
  file            = {:https\://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10218082:PDF},
  isbn            = {978-1-6654-9314-7},
  issn            = {1935-4576},
  journal         = {2023 IEEE 21st International Conference on Industrial Informatics (INDIN)},
  keywords        = {Computers, Computational modeling, Production, Software, Automobiles, Engines, Standards, Software-defined car, Service-oriented vehicle diagnostics, Control and testing, Car-tailored workflow engine, Executing BPMN},
}

DOI information:

@InProceedings{Koenig2023,
  author    = {Simone König and Birgit Vogel-Heuser and Jan Wilch and Tobias Unger and Michael Hahn and Stjepan Soldo and Oliver Kopp},
  booktitle = {2023 {IEEE} 21st International Conference on Industrial Informatics ({INDIN})},
  title     = {{BPMN}4CARS: A Car-Tailored Workflow Engine},
  year      = {2023},
  month     = {jul},
  publisher = {{IEEE}},
  doi       = {10.1109/indin51400.2023.10218082},
}

OMG, there is so much wrong. I opened https://github.com/JabRef/jabref-issue-melting-pot/issues/293 so that we move journal to booktitle during fetching.

booktitle for "journal" is OK.


Another example:

IEEE CBI - The Premiere Conference on Business Informatics - https://www.cbi-series.org/

Example Paper: https://ieeexplore.ieee.org/document/10218082

Explanation: TBD


So we can't just read the booktitle value and try to look up its abbreviation. We either have to keep track of the proceedings abbreviations or use conference abbreviations and try to inject the abbreviated conference name into the proceedings name.

We currently have:

image

Proposal:

2023 {IEEE} 21st International Conference on Industrial Informatics ({INDIN}) -> 21st INDIN (default) -> INDIN (shortest unique)

Heuristics:

I'm not sure if injecting conference names is gonna work, as it's not clear that we can extract conference names from proceedings. It seems that there is this pattern that repeats, "Proceedings of (conference name)", but I don't know if it's always the case.

Sometimes yes, sometimes not. Just cover that case. An 80% solution is OK for the first round. It can still be improved.
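
One possible shape for such a heuristic, as a sketch only. It assumes the acronym appears in parentheses at the end of the booktitle and that the edition is written as an English ordinal ("21st"); the class name, method names, and regular expressions are illustrative and not part of JabRef.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BooktitleHeuristicSketch {
    private static final Pattern PROCEEDINGS_PREFIX =
            Pattern.compile("^Proceedings of (the )?", Pattern.CASE_INSENSITIVE);
    // Acronym in parentheses at the very end, e.g. "(INDIN)".
    private static final Pattern TRAILING_ACRONYM =
            Pattern.compile("\\(([A-Z][A-Za-z0-9-]*)\\)\\s*$");
    // English ordinal such as "21st"; a bare year like "2023" does not match.
    private static final Pattern ORDINAL =
            Pattern.compile("\\b(\\d+(?:st|nd|rd|th))\\b");

    // Drops BibTeX protection braces and a leading "Proceedings of (the)".
    static String normalize(String booktitle) {
        String withoutBraces = booktitle.replace("{", "").replace("}", "").trim();
        return PROCEEDINGS_PREFIX.matcher(withoutBraces).replaceFirst("");
    }

    // Shortest unique form: the trailing acronym alone, if there is one.
    static Optional<String> shortestUnique(String booktitle) {
        Matcher acronym = TRAILING_ACRONYM.matcher(normalize(booktitle));
        return acronym.find() ? Optional.of(acronym.group(1)) : Optional.empty();
    }

    // Default form: "<ordinal> <acronym>" when an ordinal is present, else the acronym.
    static Optional<String> defaultAbbreviation(String booktitle) {
        Optional<String> acronym = shortestUnique(booktitle);
        if (!acronym.isPresent()) {
            return Optional.empty();
        }
        Matcher ordinal = ORDINAL.matcher(normalize(booktitle));
        String result = ordinal.find() ? ordinal.group(1) + " " + acronym.get() : acronym.get();
        return Optional.of(result);
    }
}
```

For the INDIN booktitle this gives "21st INDIN" and "INDIN". For "International Conference on Business Process Management" both methods return an empty Optional, because the title contains no acronym.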

koppor commented 1 year ago

I know that with the heuristics, one does not always need an abbreviation list. However, the BPM conference shows that the abbreviation is not always included in the conference title. Thus, one needs the list in that case.