Closed yuxqiu closed 11 months ago
How should we handle strings like this: A and {<some whitespace here>} and B?
I think it makes sense to treat it as undefined behavior, even though biblatex treats it as a name consisting of a blank character.
@book{emptystring,
title = {Design patterns: elements of reusable object-oriented software},
author = {Gamma, Erich and { } and Helm, Richard and Johnson, Ralph and Vlissides, John M.},
year = {1995},
}
Generated .bbl file:
\bibitem{emptystring}
E.~Gamma, { }, R.~Helm, R.~Johnson, and J.~M. Vlissides, \emph{Design patterns: elements of reusable object-oriented software}, 1995.
The current implementation of Person::parse treats it as an empty name.
(Not directly related to this PR) How should we handle multiple consecutive whitespace characters in a verbatim?
The biblatex package treats all these consecutive blank characters as one blank character. However, the current implementation of the crate retains all characters in the verbatim script. This is why the title is rendered incorrectly, as shown in https://github.com/typst/hayagriva/issues/47.
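To make the collapsing behavior concrete, here is a minimal sketch. The function name collapse_whitespace is hypothetical and not part of the crate, and it assumes that every run of whitespace (including newlines) should become a single ASCII space, which may not match biblatex/biber in every detail.

```rust
/// Hypothetical helper (not part of the crate): replaces every run of
/// consecutive whitespace characters with a single ASCII space.
/// Assumption: newlines and Unicode whitespace are collapsed like spaces.
fn collapse_whitespace(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    let mut prev_ws = false; // true while we are inside a whitespace run

    for c in s.chars() {
        if c.is_whitespace() {
            // Emit one space for the first character of the run only.
            if !prev_ws {
                out.push(' ');
            }
            prev_ws = true;
        } else {
            out.push(c);
            prev_ws = false;
        }
    }
    out
}
```

Leading and trailing runs are reduced to one space rather than removed, so trimming stays a separate decision for the caller.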
Thank you for your PR! I think it makes sense to treat verbatim blocks with whitespace between ands as UD (because it isn't defined anywhere afaik). The multiple whitespace characters should likely be collapsed if that's what biblatex/biber does.
Thank you for sharing your opinion! I think the verbatim issue should be investigated in detail and fixed in another PR later. So, this PR is now ready for review.
Mysteriously, CI keeps failing at date parsing (I haven't changed this part of the code). On my computer, all tests pass successfully.
Edit: This may be related to the new version of the chrono crate that was released a few hours ago. We should add Cargo.lock to the repository.
I will try to fix a tiny error later, so I have converted this PR to a draft for now.
I skimmed your code and reworded a few comments. Looking forward to your fix!
I chose to revert to the previous splitting strategy (splitting by keyword and then checking for surrounding characters) instead of splitting by whitespace and then checking for ==keyword. The latter made the new test on verbatim fail, as it could not clearly handle the whitespace between verbatim chunks.
What I do now is keep all the whitespace between fields and trim only the beginning and end of the data in the latest vector before pushing it to out. I think this makes the function more general-purpose, and it leaves the responsibility of handling non-leading and non-trailing whitespace to the call site.
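As a rough illustration of the trimming step described above, here is a sketch. The Chunk enum and the trim_ends function are hypothetical stand-ins, not the crate's real types, and it assumes that verbatim chunks are left untouched even when they sit at either end.

```rust
/// Hypothetical stand-in for the crate's chunk type.
#[derive(Debug, Clone, PartialEq)]
enum Chunk {
    Normal(String),
    Verbatim(String),
}

/// Hypothetical helper: trims only the start of the first chunk and the end
/// of the last chunk. Whitespace between chunks is kept as-is, so the call
/// site decides what to do with it.
/// Assumption: verbatim chunks are never trimmed.
fn trim_ends(mut latest: Vec<Chunk>) -> Vec<Chunk> {
    if let Some(Chunk::Normal(s)) = latest.first_mut() {
        *s = s.trim_start().to_string();
    }
    if let Some(Chunk::Normal(s)) = latest.last_mut() {
        *s = s.trim_end().to_string();
    }
    latest
}
```

A vector holding a single normal chunk is trimmed on both sides, because that chunk is both the first and the last element.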
Related Issue
Try to fix #28.
What are the changes/fixes in this PR?
This PR adds support for multiline author fields.
Page 16 of the BibLaTeX manual states that:
However, the old implementation assumes that only <space>and<space> is a valid separator, which causes the problem in the referenced issue. The correct approach is to consider a split valid only if the character immediately before "and" and the character immediately after "and" are both whitespace (either ASCII or Unicode whitespace).
Leading or Trailing "and"
When "and" is encountered at the beginning or end of a person's name, and there are no other possible splits (no surrounding whitespaces), it should be considered as part of the name.
Consecutive "and"
When there are multiple consecutive "and" separated by spaces, we should treat the name between the two "and" as empty.
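The three rules above can be sketched on a plain string as follows. The function split_on_and is hypothetical and is not the code in this PR: it ignores brace nesting and verbatim chunks, and it assumes the keyword is matched case-sensitively.

```rust
/// Hypothetical helper: splits `input` at every "and" that has a whitespace
/// character (ASCII or Unicode) directly before and directly after it.
/// Assumptions: brace nesting is ignored and the match is case-sensitive.
fn split_on_and(input: &str) -> Vec<String> {
    let chars: Vec<(usize, char)> = input.char_indices().collect();
    let mut parts = Vec::new();
    let mut start = 0; // byte offset where the current name begins
    let mut i = 0; // index into `chars`

    while i < chars.len() {
        let (pos, _) = chars[i];
        let is_and = input[pos..].starts_with("and");
        // A leading "and" has no character before it, so it never splits.
        let before_ws = i > 0 && chars[i - 1].1.is_whitespace();
        // A trailing "and" has no character after it, so it never splits.
        let after_ws = chars
            .get(i + 3)
            .map_or(false, |&(_, c)| c.is_whitespace());

        if is_and && before_ws && after_ws {
            parts.push(input[start..pos].trim().to_string());
            start = pos + 3; // "and" is three ASCII bytes
            i += 3; // the following whitespace may precede another "and"
        } else {
            i += 1;
        }
    }
    parts.push(input[start..].trim().to_string());
    parts
}
```

With this sketch, "A and and B" yields three parts with an empty middle name, while "and A" and "A and" are each returned as a single name.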
Experiment
I tried to test the above claims in LaTeX with the following files, using the bib style file downloaded from here:
I get the following outputs: