Closed dieghernan closed 1 year ago
@dieghernan I think one huge component that is missing from Phase 2 is addressing the underlying memory leaks. Or is this what "alternative solutions" means?
I should note that work was initially started by @matthewwiese in #40.
I did start work on fixing the rchk warnings some months back and made good progress, although I never updated the PR because I wasn’t able to eliminate possible protection stack imbalance
completely. A lot of the warnings in #33 can be mitigated by avoiding UNPROTECT_PTR(s)
in favor of respecting the stack via UNPROTECT(n)
, in exchange for slightly more verbose logic in the grammar section and elsewhere. Despite these improvements, protection stack imbalance warnings would persist. My initial hunch was that it was merely a false positive and that rchk was getting "confused" by the generated parser - from rchk USAGE:
There are still some false alarms and bailouts for code that is still too complicated, and indeed there are not-useful reports for functions that have protection stack imbalance by design.
However, my changes still didn’t solve problems like #42, which suggests something more fundamental. I can update the PR with what I have if it'll be of benefit, but unfortunately I don't have the time to devote the proper attention to this.
I agree that any project to improve the package is moot unless the memory bugs are addressed.
Thanks both for your comments. For clarity, what I am talking about in Phase 2 is to rewrite the package using base R rather to fix the current one, i.e. dropping the C code.
This is kind of risky, so it is important to have a full battery of tests covering the most extreme cases.
I am aware that Phase 2 may not be satisfactory, but even in the case you decide not to adopt it (it would be fine for me) I think it is worth to try
For what it's worth, I think rewriting in R is a good idea. It eases future maintenance and allows existing users of the package to keep on working with only a minor performance penalty. If for some reason the parsing of BibTeX files is a bottleneck, one could always switch to rbibutils which appears to roll a hand-written C parser.
I don't know whether you're a student @dieghernan, but this could make for a great GSoC project if @coatless is up for it.
Closed as #47 is now in and the package is back on CRAN.
Hi
Following #28 find here a proposed roadmap for improving the package. This is to be discussed:
Phase 1: Current state
Add test ideally based in snapshot (testthat >= 3):
Tests
[x] Move to testthat > = 3
[x] Improve current testing, including test for checking already closed issues.
[x] Add separated test for advanced features on bibtex (crossref, use of STRING)
Phase 2: Alternative solution
[x] Implement and Test alternative solution on separate branch
Phase 3: Agree on changes
Nice to have
bibtex.R
Happy to get feedback on this