Open efoerster opened 4 years ago
I second this - please provide a crate on crates.io, to allow easier packaging for texlab.
Yes, I plan to do this very soon. I was holding off because of the messiness of releasing internal crates as well. But a solution emerged when reading a rust_analyzer CI script, which is to add an extra crate name prefix during publishing so people can see very obviously that they’re internal.
Hey! I also think it would be great to do this. If you need any help, just say so!
Hi @cormacrelf! Are you still planning to release it to crates.io? It would help texlab a lot! Thanks!
I don't use texlab (though it looks cool), but it seems that publishing their crate is blocked by citeproc-rs not being available.
https://github.com/latex-lsp/texlab/issues/399
Just out of curiosity, how does texlab use citeproc-rs?
Any updates on this? Crates.io support for any library makes packaging on Guix trivial, and I'm really missing my LaTeX completions... texlab appears to use this to parse BibTeX files to correctly complete citation names.
I just noticed that texlab stopped using this crate: https://github.com/latex-lsp/texlab/commit/bfcfff518f6012ca5ed5398d66e3196d4bcb8808
This really needs to be a higher priority. It's now been more than two years since the OP opened this issue!
Other developers/projects aren't going to rely on, or contribute to, citeproc-rs if it's not available via crates.io.
I just noticed that texlab stopped using this crate: latex-lsp/texlab@bfcfff5
And it is now published on crates.io
@cormacrelf what is the current status? As soon as the 1.67 lifetime issue is fixed, wouldn't you consider this ready for a release on crates.io now?
@kmaasrud - just curious, and I have no insight into the status, but are you wanting to use citeproc-rs with djoc?
https://github.com/kmaasrud/djoc
Cool to see that, BTW!
@bdarcus yes exactly! CSL is no doubt the future of citation processing, and while there exists a LaTeX package for it, it is written in Lua and thus not supported by Tectonic/XeTeX (which I'm using as my LaTeX backend.)
I also want the binary to be self-contained, which is not possible when using BibTeX (embedding biber would be a nightmare.)
@kmaasrud cool; let me know if and when you want feedback.
... while there exists a LaTeX package for it, it is written in Lua and thus not supported by Tectonic/XeTeX (which I'm using as my LaTeX backend.)
@kmaasrud Technically speaking, it also works with XeTeX by replacing the bibtex (or biber) command-line procedure with citeproc-lua.
@zepinglee Indeed, but then the external dependency only shifts from biber over to citeproc-lua. Lua would definitely be easier to embed than biber though, so I might need to look into that.
However, given that a comprehensive crate like this exists, it'd be a shame to depend on Lua...
I ended up writing a long comment related to this thread over at hayagriva, a related Rust-based citation processor (with a crate!):
https://github.com/typst/hayagriva/issues/32#issuecomment-1482733264
I'll chime in here and note that I've been keeping an eye on citeproc-rs for a while, as both a long-time contributor to the Zotero ecosystem (I'm the author and maintainer of Pyzotero) and a long-time Rust user (I've been publishing and maintaining widely-used crates since Rust 1.0 in 2015).
I would be interested in contributing / helping to maintain the crate, but a couple of things are holding me back:
In summary: I'm not interested in contributing to something that doesn't have a future as part of Zotero, but if it does, by all means let me know if you're interested in contributions (cc @cormacrelf @dstillman)
So I think this is a bit of a chicken-and-egg problem.
We don't currently have anyone able to work on citeproc-rs or even to assess patches — I could click the merge button on PRs but I wouldn't really know what I was accepting. Given the performance issues we saw and various other blockers (an incomplete list, I'm sure), and the relative completeness and suitability of citeproc-js, we're not particularly inclined to invest more in this project in order to determine if it could ever be a suitable replacement in Zotero. And without a clear future in Zotero, I'm not particularly inclined to publish it to crates.io under our name, nor do I think it would really make sense for anyone to do so without the project being under active development. As far as I know citeproc-rs won't currently even parse many/most current CSL styles, since it doesn't have full CSL 1.0.2 support, so it's likely of pretty limited use to anyone at this point.
If someone from the community was willing to help address the remaining issues and get citeproc-rs to the point where it was clear that there was a path to using it as the default processor in Zotero, we would be open to putting more resources towards its continued development as well as its health as an open-source project. But I don't know if anyone would be willing to do that without a stronger guarantee that it was going to end up in Zotero, and that's just not something we can promise at this point.
So I'm not sure how we get past that. @urschrei's offer to contribute is much appreciated, but I'm not clear if that's just about the crate or the processor more generally, and the latter is obviously a much bigger lift. There's no shortage of love for Rust, but I'm not sure that translates into love for CSL processor development.
As it is, other than the occasional issue (some of which we could work around in Zotero if we had to), citeproc-js mostly just does what we need, even on platforms like iOS where we need to call out to JS.
Sorry I don't have a better answer here.
Thanks Dan, that's helpful. Maybe a sensible approach here is a strictly time-bounded attempt on my part to get to grips with the codebase and see whether full CSL 1.0.2 support is possible without a significant time investment, since there seems to be little point in proceeding otherwise.
I agree, that's helpful. Not an ideal scenario, but you explain it well, and give developers an option.
Seems like someone, or some group, needs to get the codebase into solid enough shape that it justifies further investment, and then release it on crates.io.
I'm curious where the problem lies, given other processors have been successfully developed by single developers.
Maybe a sensible approach here is a strictly time-bounded attempt on my part to get to grips with the codebase and see whether full CSL 1.0.2 support is possible without a significant time investment ...
On this, I should emphasize that the changes in that release are trivial; mostly things like new variable names.
So if there's a problem doing this:
I am guessing the bigger issue is the performance problems Dan mentioned (which also seem surprising).
@urschrei it might be worth looking at the even newer Haskell citeproc, if you are able to identify any processing bottlenecks and are looking for ideas. That was a clean rewrite of an earlier implementation, and supposedly significantly faster.
Actually supporting 1.0.2 terms should be trivial. The larger issue there is citeproc-rs currently not accepting unknown input, which isn't really appropriate for our use case for a number of reasons.
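To make the "unknown input" point concrete, here is a minimal sketch of one way a processor could tolerate names it does not recognize instead of rejecting the whole style. It assumes that "unknown input" means unrecognized variable or term names, which the thread does not spell out. The `Variable` enum, `from_name`, and `parse_variables` are hypothetical names for illustration and are not citeproc-rs's actual types or API.

```rust
/// A few CSL variable names, plus a catch-all for anything unrecognized.
/// Hypothetical type for illustration; not taken from citeproc-rs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Variable {
    Title,
    ContainerTitle,
    Publisher,
    /// Keeps the raw name of a variable this processor does not know,
    /// so the input can still be parsed and the name is not lost.
    Unknown(String),
}

impl Variable {
    /// Maps a name to a variant; never fails on unrecognized names.
    pub fn from_name(name: &str) -> Variable {
        match name {
            "title" => Variable::Title,
            "container-title" => Variable::ContainerTitle,
            "publisher" => Variable::Publisher,
            other => Variable::Unknown(other.to_string()),
        }
    }
}

/// Parses a list of names, keeping unknown ones as `Variable::Unknown`.
pub fn parse_variables(names: &[&str]) -> Vec<Variable> {
    names.iter().map(|n| Variable::from_name(n)).collect()
}

fn main() {
    // "some-future-variable" is a made-up name standing in for input
    // introduced after the processor was written.
    let vars = parse_variables(&["title", "some-future-variable"]);
    println!("{:?}", vars);
}
```

The design choice is that a strict parser returns an error on the first unknown name, while this one defers the decision: rendering code can later skip, warn about, or pass through `Unknown` values.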
For performance, there are two concerns: pathological cases, which hopefully can be easily addressed with some optimization, and more fundamental problems with WebAssembly performance, which wouldn't reflect on the processor itself but would affect the Zotero desktop use case (though we might be able to just run it as a separate binary). We need to evaluate the WebAssembly performance in the Zotero 7 dev build, but it will be easier for us and others to do that once citeproc-rs can parse 1.0.2 styles.
Would #13 be a potential alternative to WebAssembly, assuming the other details can be sorted out?
Would be cool if such a thing were compatible with the JSON served by https://github.com/jgm/citeproc/blob/master/man/citeproc.1.md; a standard JSON citation/bibliography API (see OpenAPI).
Not directly related to rust, but ...
I had an idea recently for a possibly radically simplified, extensible, next-gen CSL.
https://github.com/bdarcus/csl-next
I've sketched out the idea in a TypeScript model (which converts to JSON Schema), but while my skills in Lisp are not bad, and I've previously worked with Python and Ruby (and my original CSL prototype was XSLT!), I'm a total newbie with TypeScript and JS.
If anyone here has those skills and might be interested in helping me assess the viability of the idea, I'd welcome the help.
I've made some progress on the typescript model, so today decided to see what I could do with auto-generated code derived from it.
Using the ~450 LOC of auto-generated Rust code from quicktype, with this ...
    use std::fs;

    // `Style` is the struct from the quicktype-generated code (not shown here).
    fn main() {
        // Read the example JSON style file.
        let json = fs::read_to_string("src/style.csl.json")
            .expect("Unable to read file");
        // Deserialize the JSON into the Rust `Style` struct.
        let style: Style = serde_json::from_str(&json).unwrap();
        // Serialize `style.title` back to a string and print it.
        println!("{}", serde_json::to_string(&style.title).unwrap());
    }
... deserializes the JSON example style file to a Rust Style struct, and then serializes the title, with the result APA.
The compiled binary will actually fail if the input style isn't valid!
That seems potentially really useful, and maybe a way forward for CSL in general; a new, more forward-looking model and reference implementation, whose model can be autoconverted not only to JSON Schema, but also to a wide range of implementation languages, with Rust, Swift, Haskell, and Go being the most relevant.
EDIT: a little demo repo of the codegen.
Another update, very obviously rust-related.
https://github.com/bdarcus/csln
It's a reimplementation of the csl-next draft typescript model in pure Rust, with very tight coupling (thanks to serde) between the JSON schema input and internal model.
I'm pretty confident in that model, though it would need more review, testing, and iteration for me to be fully happy with it.
I'm much less confident in my programming skills, and the fact I'm a complete Rust newbie.
But I'm absolutely serious about building this out. I just need some help.
It should compile fine using cargo, and I have it licensed under the same terms as citeproc-rs.
It's not quite on par with the TypeScript processor; here's an example of where I'm at:
❯ target/debug/csln processor/examples/style.csl.yaml processor/examples/ex1.bib.yaml
Example result:
    {
      "smith1": {
        "disamb-condition": false,
        "group-index": 1,
        "group-length": 1,
        "group-key": "Smith, Sam:2023-10"
      },
So the core of the processor at this point is a sorted bibliography vector, and this HashMap.
The next step is a function to iterate through the former and template and use the latter to generate the pre-rendered AST.
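As a rough illustration of those two structures, here is a minimal sketch of a sorted bibliography vector plus a HashMap of per-reference hints with the same four fields as the example output. The `Reference` and `ProcHints` types, the sort order (author, then date, then title), and the rule that `disamb_condition` is true whenever a group has more than one member are all assumptions made for this sketch; csln's actual model and logic may differ.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified reference; the real csln model is richer.
#[derive(Debug, Clone)]
pub struct Reference {
    pub id: String,
    pub author: String,
    pub issued: String, // e.g. "2023-10"
    pub title: String,
}

/// Per-reference hints, mirroring the fields in the example output.
#[derive(Debug, Clone, PartialEq)]
pub struct ProcHints {
    pub disamb_condition: bool,
    pub group_index: usize,
    pub group_length: usize,
    pub group_key: String,
}

/// Group key in the "author:date" shape seen in the example output.
fn group_key(r: &Reference) -> String {
    format!("{}:{}", r.author, r.issued)
}

/// Returns the bibliography sorted by author, then date, then title.
pub fn sort_bibliography(mut refs: Vec<Reference>) -> Vec<Reference> {
    refs.sort_by(|a, b| {
        (&a.author, &a.issued, &a.title).cmp(&(&b.author, &b.issued, &b.title))
    });
    refs
}

/// Builds the id -> hints map from an already sorted bibliography.
pub fn build_hints(sorted: &[Reference]) -> HashMap<String, ProcHints> {
    // First pass: count how many references share each group key.
    let mut counts: HashMap<String, usize> = HashMap::new();
    for r in sorted {
        *counts.entry(group_key(r)).or_insert(0) += 1;
    }
    // Second pass: assign a 1-based index within each group.
    let mut seen: HashMap<String, usize> = HashMap::new();
    let mut hints = HashMap::new();
    for r in sorted {
        let key = group_key(r);
        let index = seen.entry(key.clone()).or_insert(0);
        *index += 1;
        let length = counts[&key];
        hints.insert(
            r.id.clone(),
            ProcHints {
                // Assumption: disambiguation is needed when a group has
                // more than one member.
                disamb_condition: length > 1,
                group_index: *index,
                group_length: length,
                group_key: key,
            },
        );
    }
    hints
}

fn main() {
    let refs = vec![Reference {
        id: "smith1".to_string(),
        author: "Smith, Sam".to_string(),
        issued: "2023-10".to_string(),
        title: "An example title".to_string(),
    }];
    let sorted = sort_bibliography(refs);
    let hints = build_hints(&sorted);
    println!("{:?}", hints["smith1"]);
}
```

A rendering step could then walk the sorted vector and look up each reference's hints by id, for example to decide whether a year suffix is needed.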
PS - just learned the typst folks are working on a 1.0 processor in Rust.
https://github.com/typst/citationberg
Feels like maybe there needs to be some collaboration across these projects.
Now that I've learned a bit of Rust so I can better understand this code base, I do think something like what I'm doing in csln and this could be aligned.
The idea is really a new input model, where the schemas are generated from the code, and so they are tightly-aligned.
Oh and removing a lot of unnecessary logic from the template language.
But it seems to me the processing model is generally pretty sound, with lots of performance optimizations. Not sure from my brief review what the bottlenecks could be, but I suspect they're resolvable.
I've forked the repo over at the CSL org, and applied Cormac's 1.67 branch, so at least it (partially) compiles :-)
https://github.com/citation-style-language/citeproc-rs
But there are 125 clippy warnings, citeproc-io doesn't build, and with my newbie skills I don't know how to fix it all.
I still think it'd likely be easier and more future-proof to merge what I'm doing with some of what's in this code base.
The typst folks are just about to merge CSL 1.0 support in their Hayagriva library.
https://github.com/typst/hayagriva/pull/66
It relies on their parser library:
Thanks for the great project. We are using citeproc-rs in texlab and would like to publish texlab on crates.io soon, see https://github.com/latex-lsp/texlab/issues/152. However, this would require citeproc-rs to be available on crates.io, too. Do you have plans on making a release on crates.io?