fiduswriter / biblatex-csl-converter

A set of JavaScript converters: bib(la)tex => json, json => csl, and json => biblatex
GNU Lesser General Public License v3.0

Feature request: mark commands that were ignored #82

Open retorquere opened 7 years ago

retorquere commented 7 years ago

Would it complicate the parser significantly to get ignored LaTeX commands in the warnings or errors? That would allow me to add a notification to references such as these so people would know there is likely cleanup work to be done.

johanneswilm commented 7 years ago

I am actually not sure. If there are commands we don't know about -- how can we know for certain that they are commands? In general I believe we just ignore all such commands, so any command definition would be ignored.

johanneswilm commented 7 years ago

Would it not be just as practical for you to scan for definitions in the preamble, and if present give a warning?
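A minimal sketch of such a preamble scan, assuming the caller has the raw bib(la)tex source as a string. `findCommandDefinitions` and the warning shape are hypothetical names for illustration, not part of biblatex-csl-converter:

```javascript
// Hypothetical helper, not part of biblatex-csl-converter.
// Returns the names of commands defined via \newcommand and friends.
function findCommandDefinitions(bibSource) {
    // Matches e.g. \newcommand{\foo}, \renewcommand*{\foo}, \def\foo
    const definition = /\\(?:newcommand|renewcommand|providecommand|def)\*?\s*\{?\s*\\([A-Za-z@]+)/g
    const names = []
    let match
    while ((match = definition.exec(bibSource)) !== null) {
        names.push(match[1])
    }
    return names
}

const sample = String.raw`@preamble{ "\newcommand{\noopsort}[1]{} " }
@book{key, title = {A title}}`

// One warning per definition found; the warning shape is an assumption.
const defined = findCommandDefinitions(sample)
const warnings = defined.map(name => ({type: 'command_definition', command: name}))
```

This only detects that definitions exist. It does not tell you where the defined commands are used, so it complements rather than replaces per-reference flagging.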

retorquere commented 7 years ago

That would be useful in addition to command flagging I think. It would also be useful to know if someone uses a straightforward latex command we don't support -- something like \vphantom for example.

I had forgotten how this parser works -- I'm still thinking in terms of my old parser, which scanned for commands and then checked whether each one was supported, applying it to its arguments if it was and just outputting the arguments if not. This parser matches command-plus-arguments, so "I know this is a command but I don't know what to do with it" isn't really a concept in this parser (right?)
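For illustration, the scan-then-look-up approach could be sketched roughly as below. This is not the old parser's actual code: the `supported` table, `convertCommands`, and the HTML-like output are assumptions, and the regex only handles a single non-nested argument.

```javascript
// Hypothetical table of commands the converter knows how to apply.
const supported = {
    emph: arg => `<i>${arg}</i>`,
    textbf: arg => `<b>${arg}</b>`,
}

// Finds \name{argument} pairs. Supported commands are applied to their
// argument; unsupported ones are recorded and replaced by the bare argument.
function convertCommands(text) {
    const ignored = []
    const output = text.replace(/\\([A-Za-z]+)\s*\{([^{}]*)\}/g, (whole, name, arg) => {
        if (Object.prototype.hasOwnProperty.call(supported, name)) {
            return supported[name](arg)
        }
        ignored.push(name)
        return arg
    })
    return {output, ignored}
}

const result = convertCommands(String.raw`\emph{Title} \vphantom{bla bla}`)
// result.output is '<i>Title</i> bla bla'; result.ignored is ['vphantom']
```

Because the command name is identified before the lookup, the "known to be a command, but unsupported" case falls out naturally as the `ignored` list.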

Anyhow, I think this is only useful if it's really easy to add cleanly. Bib(La)TeX parsing by anything other than latex/biber is going to be best-effort by its very nature.

johanneswilm commented 7 years ago

Yeah, parsing like that was a lot slower (we had that until someone from the Netherlands came along and complained that he couldn't upload 5000 bibtex references into FW. ;-) ). What one could do would be a post-mortem analysis -- basically looking whether anything looking like a command is still in the output JSON. We could have a flag for enabling/disabling the analysis, because it's probably not something everyone would need.
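A rough sketch of what such an opt-in post-mortem pass might look like. The function names, the `checkLeftoverCommands` flag, and the shape of the sample entry are assumptions for illustration, not the converter's actual API or output format:

```javascript
// Recursively walks any JSON-like value and collects the names of
// anything that looks like a LaTeX command (backslash plus letters).
function collectLeftoverCommands(value, found = new Set()) {
    if (typeof value === 'string') {
        for (const match of value.matchAll(/\\([A-Za-z]+)/g)) {
            found.add(match[1])
        }
    } else if (Array.isArray(value)) {
        value.forEach(item => collectLeftoverCommands(item, found))
    } else if (value !== null && typeof value === 'object') {
        Object.values(value).forEach(item => collectLeftoverCommands(item, found))
    }
    return found
}

// The analysis is skipped unless the (hypothetical) flag is set.
function analyze(entries, options = {}) {
    if (!options.checkLeftoverCommands) return []
    return [...collectLeftoverCommands(entries)].map(command => ({type: 'leftover_command', command}))
}

// Illustrative entry shape only.
const parsed = {
    1: {entry_key: 'doe2017', fields: {title: [{type: 'text', text: 'A \\vphantom title'}]}},
}
const leftoverWarnings = analyze(parsed, {checkLeftoverCommands: true})
```

A pass like this can only report commands that survive into the output. Anything the parser silently drops along the way would have to be recorded at the point where it is dropped.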

retorquere commented 7 years ago

I may know that guy :laughing:

A post-mortem would be plenty useful, but again, really only if it's easy to add cleanly. This is the first time I've seen a request like this in all the years I've done BBT, and BBT would previously just pass unsupported commands into the references as text (so \vphantom{bla bla} would end up in the reference as vphantombla bla).

Currently the parser doesn't pass unrecognized commands into the output though. The Franc{\ocirc{u}} from the sample ends up as Franc.

johanneswilm commented 7 years ago

One should be able to add such an analyzer at https://github.com/fiduswriter/biblatex-csl-converter/blob/master/src/import/biblatex.js#L726 before the \u0871 characters are turned into backslashes. I don't think we would ever need such a parser though.

retorquere commented 7 years ago

But if that's the case, wouldn't I have seen Francocircu as the output? Would that line see the ocirc anywhere?

johanneswilm commented 7 years ago

You're right. For literal fields, we skip the command here: https://github.com/fiduswriter/biblatex-csl-converter/blob/master/src/import/literal-parser.js#L138 . Feel free to experiment with adding something there.