Closed RubelMozumder closed 8 months ago
Thanks for fixing the original issue that I raised this morning.
In the meantime, I also tested some more edges case and I am not sure that the behavior is the one we intended. For example, the following three xrefs would convert properly:
NXarchive(NXobject):
(NXentry)entry:
\@index:
title:
experiment_identifier(NX_CHAR):
doc:
- |
Test doc.
- | # Example0
xref:
spec: spec0
term: term0
url: url0"
- | # Example1
" xref:
spec: spec1
term: term1
url: url1"
- | # Example2
spec: spec2
xref:
term: term2
url: url2"
Especially the last one seems weird since spec
comes before xref
. On the other hand, the one that I do expect to work does not convert with AttributeError: 'NoneType' object has no attribute 'group'
.
- | # Example3
xref:
spec: spec3
term: term3
url: url3
I am therefore a bit confused as to what the rules are for using the xref feature. I thought the plan was to have it as described in example 3, i.e., you first check if xref is there and then if the other three keywords are present and properly indented. Can I just write the xref
anyway I like?
Thanks for fixing the original issue that I raised this morning.
In the meantime, I also tested some more edges case and I am not sure that the behavior is the one we intended. For example, the following three xrefs would convert properly:
NXarchive(NXobject): (NXentry)entry: \@index: title: experiment_identifier(NX_CHAR): doc: - | Test doc. - | # Example0 xref: spec: spec0 term: term0 url: url0" - | # Example1 " xref: spec: spec1 term: term1 url: url1" - | # Example2 spec: spec2 xref: term: term2 url: url2"
Especially the last one seems weird since
spec
comes beforexref
. On the other hand, the one that I do expect to work does not convert withAttributeError: 'NoneType' object has no attribute 'group'
.- | # Example3 xref: spec: spec3 term: term3 url: url3
I am therefore a bit confused as to what the rules are for using the xref feature. I thought the plan was to have it as described in example 3, i.e., you first check if xref is there and then if the other three keywords are present and properly indented. Can I just write the
xref
anyway I like?
Thank you for creating such a variety of examples. here is the docs
But the last one does not make any sense because xref should be in the upper hierarchy or administrative location. xref refers that it has some amount of reference information which are spec, term and URL probably more other info may be added later. But reshuffling in spec, term and URL makes sense.
But the last one does not make any sense because xref should be in the upper hierarchy or administrative location. xref refers that it has some amount of reference information which are spec, term and URL probably more other info may be added later. But reshuffling in spec, term and URL makes sense.
I agree that example 2 above does not make sense, but even this one converts to
This concept is related to term `term2`_ of the spec2 standard.
.. _term2: url2
Wouldn't we want to catch that and prevent the user from writing the nyaml like this?
But the last one does not make any sense because xref should be in the upper hierarchy or administrative location. xref refers that it has some amount of reference information which are spec, term and URL probably more other info may be added later. But reshuffling in spec, term and URL makes sense.
I agree that example 2 above does not make sense, but even this one converts to
This concept is related to term `term2`_ of the spec2 standard. .. _term2: url2
Wouldn't we want to catch that and prevent the user from writing the nyaml like this?
your are absolutely right. In that case we should raise and error (this is your proposal, right). Thanks for bringing it.
But the last one does not make any sense because xref should be in the upper hierarchy or administrative location. xref refers that it has some amount of reference information which are spec, term and URL probably more other info may be added later. But reshuffling in spec, term and URL makes sense.
I agree that example 2 above does not make sense, but even this one converts to
This concept is related to term `term2`_ of the spec2 standard. .. _term2: url2
Wouldn't we want to catch that and prevent the user from writing the nyaml like this?
your are absolutely right. In that case we should raise and error (this is your proposal, right). Thanks for bringing it.
I am thinking that maybe it is a good idea to enforce a certain structure:
xref
as the first string of the doc part. If the doc part does not start with xref
, we treat it as a regular text doc. So, xref
really is taken as a keyword to indicate that this doc part is about a reference.spec
, term
, and url
(there can be whitespace). Each of them should be present once and only once.spec
before xref
), we should throw an error.@RubelMozumder @lukaspie I made a suggestion for a new xref parsing function. Can you check and test it? Probably we need to add error catching when yaml cannot be parsed.
Edit: I added an additional test where we can parametrise different inputs. Maybe we should have one for cases where we wan't it to fail, too.
Thanks @domna. Exactly, what I wanted to say is that the xref part should be a valid yaml! This function works as expected for all of the examples outlined above. The tests also look good.
My proposal for the suggested use in the documentation is to use 2-space indentation. Like this, you don't need to use "
in the docstrings (which you need if you start the string at the level of - |
.
doc:
- |
Test doc.
- |
xref:
spec: spec0
term: term0
url: url0
Thanks @domna. Exactly, what I wanted to say is that the xref part should be a valid yaml! This function works as expected for all of the examples outlined above. The tests also look good.
My proposal for the suggested use in the documentation is to use 2-space indentation. Like this, you don't need to use
"
in the docstrings (which you need if you start the string at the level of- |
.
I agree, maybe we should even disallow the usage of "
entirely, because the xref
property is inherently multiline. When we add a "
it actually means that this is contained in the yaml text as is. We can simply remove this by removing the strip('"')
for the clean_txt
. We just have to update the docs accordingly.
2-space indentation should work. As long as all the keys are lined up any indentation should work.
However, we still have the problem if there are duplicate keys. While yaml says keys should be unique pyyaml just takes the last value. It's a known bug/feature https://github.com/yaml/pyyaml/issues/165 (the community seems to not have agreed yet whether this is a bug or a feature). But I think we could add a simple check (split to lines and check if len(flatten(xref_dict))
is the same - that way we also avoid nested keys).
However, we still have the problem if there are duplicate keys. While yaml says keys should be unique pyyaml just takes the last value. It's a known bug/feature yaml/pyyaml#165 (the community seems to not have agreed yet whether this is a bug or a feature). But I think we could add a simple check (split to lines and check if
len(flatten(xref_dict))
is the same - that way we also avoid nested keys).
I added some checks for edge cases. Should be all fine now (except that removing of "
lets all the other tests fail)
Do we want to fail if the xref docstring starts with "
or do we want to allow both (current status)?
Edit: I removed the support for it, because it also introduces artifact's for other docstrings. Starting and ending "
are carried over even though they should be kept according to the yaml rules.
However, we still have the problem if there are duplicate keys. While yaml says keys should be unique pyyaml just takes the last value. It's a known bug/feature yaml/pyyaml#165 (the community seems to not have agreed yet whether this is a bug or a feature). But I think we could add a simple check (split to lines and check if
len(flatten(xref_dict))
is the same - that way we also avoid nested keys).I added some checks for edge cases. Should be all fine now (except that removing of
"
lets all the other tests fail) Do we want to fail if the xref docstring starts with"
or do we want to allow both (current status)?Edit: I removed the support for it, because it also introduces artifact's for other docstrings. Starting and ending
"
are carried over even though they should be kept according to the yaml rules.
I see there are a few tests for indivisual correct and wrong one, thanks @domna.
I added the artifact "
around the text so that any de-indentation does not affect yaml rules, as each line contains a :
, and therefore yaml user should not bother of it.
@RubelMozumder @lukaspie I made a suggestion for a new xref parsing function. Can you check and test it? Probably we need to add error catching when yaml cannot be parsed.
Edit: I added an additional test where we can parametrise different inputs. Maybe we should have one for cases where we wan't it to fail, too.
Sorry, before getting there you finished it.
I see there are a few tests for indivisual correct and wrong one, thanks @domna. I added the artifact
"
around the text so that any de-indentation does not affect yaml rules, as each line contains a:
, and therefore yaml user should not bother of it.
Doesn't matter for yaml as long as it's inside doc: |
. Then basically everything is allowed.
Changes:
term
will be accept anything in-between aspace
andstandard
Fixes #115