Open bdarcus opened 1 year ago
Here I wonder if the absence of space between the Latin and Chinese names is really intended?
[FAILED] test/csl/name_EtAlWithCombined.txt --- expected +++ actual @@ <div class="csl-entry">John Doe。</div> - <div class="csl-entry">——著,Ziggy Zither等點校。</div> + <div class="csl-entry">——著,Ziggy Zither 等點校。</div>
There is no universal standard about the space character between Han and western characters but omitting it is one of the accepting styles.
In typesetting, there is usually a 1/6 to 1/4 em space (width) between Han and western glyphs. Many softwares like Adobe Indesign, Microsoft Office, and TeX with appropriate packages can handle this automatically. However people do not form a consistent style of mixed Han and western in the input text characters. As far as I know, companies like Apple and Google tend to insert these space character in Chinese but omit them in Japanese (compare https://www.apple.com.cn/ and https://www.apple.com/jp/).
So what do you suggest on that one, @zepinglee? Leave as is?
Aside: the others ones are all also, or maybe more, about the spec.
So what do you suggest on that one, @zepinglee? Leave as is?
Aside: the others ones are all also, or maybe more, about the spec.
Yes. If I remember correctly, the output of citeproc-js
is in this compact style.
Wouldn't it be sensible to have a set of conformance tests that only test behavior that is required by the spec? (And not also behavior that is not spec but happens to have been implemented by citeproc.js?) What's not controversial is that this issue is not in the spec.
Right, so really this is more a spec issue that anything? E.g. I should move this issue there?
To update after the issue transfer, the above tests are sensible probably, but the behavior not mentioned in the spec.
So the spec should be amended accordingly.
And/or, tests removed from the test suite.
If there's not a standard way of doing this, then I'd suggest leaving it out of the spec (and therefore the tests).
So reading between the lines, are you recommending dropping all four from the tests, and be done?
Hard to say for sure without knowing more about what these tests are intended to test for. But the general principle I'd favor would be not to test things that aren't part of the spec -- or to separate out these tests into a separate test suite for citeproc.js.
I think that's a good idea. @Jason-Abbott also ran into a bunch of these for his swift citeproc implementation (see recent posts over on discourse).
There are 3-4 different categories of tests, though, that we need to treat differently
name_EtAlWithCombined.txt
would seem to fall in that category as do, possibly, some of the others above. I'd tag/file those as "recommended" perhaps? Following these isn't necessary to produce correct citation output, but they do raise relevant considerations and provide a reasonable default.I did add ability to include metadata in the tests, which could be useful in categorizing and explaining.
https://github.com/citation-style-language/test-suite/issues/51
What if we just have a "triage" subdir of the test-suite, and so mark tests like these in need of some resolution, and move these tests there? Can also tag them.
What if we just have a "triage" subdir of the test-suite, and so mark tests like these in need of some resolution, and move these tests there? Can also tag them.
I'm fine with that -- always in favor of immediate improvements -- but I do think we should have a plan for how the triage categories look. If my above suggestions are fine, we can go with those, but obviously I've never written a processor, so make no claims that these are right.
@fbennett does that sound OK to you?
A triage folder or other indicator seems sensible to me as well. At the moment, the Swift CSL implementation I’ve been working on is set up to chug through every test in the suite, which may give me reason for some PRs to add whatever indicator is chosen here.
(My test progress slowed a bit as I had some day-job travel and refactoring but about to re-engage.)
I'll plan to follow my suggestion, unless I hear otherwise. It's easy enough to adjust these either way though.
Is your implementation open source @Jason-Abbott?
Is your implementation open source @Jason-Abbott?
It isn’t currently. I will open-source some or all of it as I finalize some product ideas.
@adam3smith , @bdarcus following up here after my learnings on citation-number
— I would guess those tests that depend on unique and undocumented behaviors of <key variable="citation-number"/>
belong in this triage
or similar sub-directory.
I would even argue that these behaviors should never be added to the specification since, although I understand the need for and relative simplicity of the original code change, it breaks the meaning of sort keys. So perhaps an off-spec
or citeproc-js
sub-directory …
No strong feeling from me. I’d only like to assist future processor writers or debuggers 🙂
My POV is anything that isn't in the spec doesn't belong in the test suite, and if there's any uncertainty about what to do with such a test, it should go in "triage."
I like your idea of an "off-spec" subdir for ones we agree shouldn't be in the spec (and your explanation makes sense to me on these). Perhaps we could just add that, and when moving tests there, include an explanation of why?
Care to PR that too?
Assuming @adam3smith agrees ...
Yeah, I’d be happy to do a PR. I’ll need to review those citation-number
tests again since a few don’t require the off-spec behavior … so, some time this week.
I would even argue that these behaviors should never be added to the specification since, although I understand the need for and relative simplicity of the original code change, it breaks the meaning of sort keys. So perhaps an
off-spec
orciteproc-js
sub-directory …
I agree. Those sort_
tests are confusing to me and I skipped them in CI.
The citepoc-js
has a local directory (https://github.com/Juris-M/citeproc-js/tree/master/fixtures/local) of tests specific to itself. Perhaps we can move them there?
I'm still struggling with how to think about things were the test behavior may be desirable but probably shouldn't be spec'd. To take this test/behavior as an example: Frank added the behavior to citeproc-js (and then added a test) based on a user request and to allow for something that would otherwise not be possible, re-interpreting a syntax that is valid according to the specs (sorting by citation-number descending) but (literally) paradoxical. As I mention to Jason above -- a processor would almost certainly be fine without meeting this test. I don't necessarily want to burden the spec with micro-managing behavior like this. At the same time, it seems wasteful to throw out the collected experience represented by tests in this category. That's why I'm suggesting a recommended
category above. Not sure if that's the right way to go about this, but I also worry that both alternatives I see -- writing every edge case into the specs in great detail or completely deprecating tests of this nature -- have significant downsides. I understand Jason as expressing similar ambivalence.
To move forward without getting bogged down on this, I think we either
I guess wrt moving them to citeproc-js local, I'm not sure what that implies in terms of advice to citeproc developers?
I hadn't noticed, @zepinglee, that citeproc-js has its own local tests built in this same way. I see sections I don't recognize in some of the tests there so apparently the test harness is also extended there.
Even so, I tend to agree with @adam3smith insofar as the impact of moving tests from here to there is unclear, an intermediate step of moving them into a subfolder here is safer, until that can be better evaluated.
Personally, my goal is to deliver a processor that behaves as Zotero and other users expect, which could well mean it incorporates citeproc-js idiosyncrasies. That's why I've so far adapted my code to all the tests here.
Seeing all those other local citeproc-js tests adds a wrinkle I'll need to think about. I'm fairly new to this CSL space. I imagine there's a conceptual distinction between off-spec tests here and local tests in citeproc-js, though I sure can't say what it is yet.
If it's at all helpful, documentation on CSL-M idiosyncracies is linked from the Jurism website.
https://juris-m.github.io/cslm-docs/
Frank
On Tue, Jun 13, 2023, 13:14 Jason Abbott @.***> wrote:
I hadn't noticed, @zepinglee https://github.com/zepinglee, that citeproc-js has its own local tests built in this same way. I see sections I don't recognize in some of the tests there so apparently the test harness is also extended there.
Even so, I tend to agree with @adam3smith https://github.com/adam3smith insofar as the impact of moving tests from here to there is unclear, an intermediate step of moving them into a subfolder here is safer, until that can be better evaluated.
Personally, my goal is to deliver a processor that behaves as Zotero and other users expect, which could well mean it incorporates citeproc-js idiosyncrasies. That's why I've so far adapted my code to all the tests here.
Seeing all those other local citeproc-js tests adds a wrinkle I'll need to think about. I'm fairly new to this CSL space. I imagine there's a conceptual distinction between off-spec tests here and local tests in citeproc-js, though I sure can't say what it is yet.
— Reply to this email directly, view it on GitHub https://github.com/citation-style-language/test-suite/issues/53#issuecomment-1588498999, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAASMSWNAI3SJ7OA5XE62ELXK7SL3ANCNFSM6AAAAAAWY4WY2M . You are receiving this because you were mentioned.Message ID: @.***>
@jgm earlier posted this over at the Discourse, noting some tests he found problematic.
https://discourse.citationstyles.org/t/upcoming-csl-meetup-context/1767/12
Any suggestions on what we do with them?
The questions and commentary below are John's.
Here are some examples of failures from citeproc-hs. In each case I may have simply missed a relative part of the spec.
Does the spec specify these behaviors for quotes?
Does the spec say to do this? (As noted in my last mail, it’s not always desirable.)
Does the spec specify these things?
Here I wonder if the absence of space between the Latin and Chinese names is really intended?