Closed smithfarm closed 4 months ago
Phew this will break a lot of external links right?
Did we ever say that our links live forever? All the other SUSE documentation had to take this poison pill, so why not ours?
Did we ever say that our links live forever?
No but that doesn't mean we should not give a 💩 either. obs-landing knows redirects. At least for the sections etc. (everything but anchors) we should do those. Like
/help/manuals/obs-user-guide/par.first_steps /help/manuals/obs-user-guide/par-first-steps
@hennevogel OK, I'll create a companion PR at obs-landing.
Did we ever say that our links live forever? All the other SUSE documentation had to take this poison pill, so why not ours?
That's true, we faced this problem in the past. We mitigated this by adding redirect rules in our Apache .htaccess
file. For the future, we introduced a "don't change your ID!" rule.
Still we face this problem from time to time when some clever managers want to change the product names. :wink:
If IDs are part of URLs, there is not much you can do to avoid this problem. Sometimes it's inevitable. You can make people aware of that to mitigate the issue.
For the record, our styleguide contains a section about Identifiers. Just to give you an idea how we use and create IDs.
That's true, we faced this problem in the past. We mitigated this by adding redirect rules in our Apache .htaccess file.
As already suggested by @hennevogel I have opened a companion PR with corresponding redirects: https://github.com/openSUSE/obs-landing/pull/466
I have opened a companion PR with corresponding redirects: openSUSE/obs-landing#466
Yes, saw it. Just wanted to give you some context from our side. :slightly_smiling_face:
We seem to have reached a consensus that migrating to Geekodoc is the way to go, so as long as there are no objections to the additional changes I have added, this PR seems ready to go in (together with its counterpart PR over at obs-landing).
I've did a quick validation check and there is more work ahead. :wink:
I forgot to mention that the xml:id
is only one part. You also need to modify the value of the linkend
attribute. It appears inside a <xref linkend="..."/>
element. These are internal cross references to other parts of the document. I'm sorry for the additional work.
@smithfarm Nathan, I've experimented a bit to make this work more efficient. After I was unsuccessful with sed
and xmlstarlet
, I wrote a small Python script:
# modifydocbook.py
import re
import sys
pattern = re.compile(r'(xml:id|linkend)="([^"]+)"') # Regular expression pattern
def modify_docbook_xml(file_path):
"""
Read a DocBook XML file by replacing dots and underscores with dashes
in xml:id and linkend attributes.
Prints the modified content to standard output.
Args:
file_path: Path to the original DocBook XML file.
"""
with open(file_path, 'r') as fh:
for line in fh:
match = pattern.search(line)
if match:
# line = line.rstrip()
attr_value = match.group(2).replace('.', '-').replace('_', '-')
replacement = rf'\1="{attr_value}"'
line = pattern.sub(replacement, line)
print(line, end="")
if __name__ == "__main__":
modify_docbook_xml(sys.argv[1])
If you install sponge
(package name is the same) you can do it in one go:
$ for xml in xml/*.xml; do
python3 modifydocbook.py $xml | sponge $xml
done
After this process, there is one ID (co-obs-sserv-struct-name
) which occurs twice. If I fix that as well and correct the ROOTID
in the DC files, both admin and user guide are valid.
You also need to modify the value of the linkend attribute.
@tomschr Thanks! But I believe I did that already... Do you see any linkend attribute value that I missed?
Thanks! But I believe I did that already...
daps thinks otherwise. :wink: It reports some xml:id and linkend errors. I count 14 files that are modified after I apply my script. :slightly_smiling_face:
@tomschr Instead of me duplicating your work, could you just push the changes right into the PR? Or just push a branch with the commit and I'll cherry-pick it in...
@tomschr The odd thing is, Geekodoc no longer complains about the .
and _
characters in the xml:id
attribute values.
I mean, I'm trying to get to them all, but I found I missed some - yet Geekodoc (daps) did not complain.
daps thinks otherwise. 😉 It reports some xml:id and linkend errors.
Where? Here in the PR it says this:
[INFO] Building DC-obs-all
[DEBUG] Validate inside container docker exec fd9b38640981f186599fbbaca8e9a91df877a2104d61d86db9027b55b2ae5468 daps -d /daps_temp-dkjdVP1OK7/source/DC-obs-all validate
[DEBUG] Validation for DC-obs-all result was 0
[DEBUG] Checking validation result...
daps -d /daps_temp-dkjdVP1OK7/source/DC-obs-all html
Your output documents are:
/tmp/daps2docker-2DhNs4mF/obs-all/html/obs-all/
I thought this meant that daps thinks the XML is valid.
daps thinks otherwise. 😉 It reports some xml:id and linkend errors.
Where? Here in the PR it says this: ... I thought this meant that daps thinks the XML is valid.
Use the option -vv
and look at the beginning. You will see probably something like this:
DOCBOOK_VERSION: 5
DOCBOOK5_RNG: /usr/share/xml/docbook/schema/rng/5.2/docbookxi.rng
If that's the case, you still validate against DocBook 5. However, DocBook doesn't have any restriction regarding IDs.
To really validate against Geekdoc, you need daps to tell what schema to use. In your ~/.config/daps/dapsrc
file, add the following line:
DOCBOOK5_RNG_URI="urn:x-suse:rng:v2:geekodoc-flat"
Then validate again and you should hopefully see some messages. For me, it looks like this:
[ERROR]: The document contains XML errors:
/home/toms/repos/GH/opensuse/obs-docu/build/.profiled/bogus_opensuse_novell/book-obs-user-guide.xml:6:217: error: value of attribute "xml:id" is invalid; must be an XML name without colons matching the regular expression "[\-0-9a-zA-Z]+"
@tomschr Instead of me duplicating your work, could you just push the changes right into the PR? Or just push a branch with the commit and I'll cherry-pick it in...
@smithfarm Absolutely! I just didn't want to intervene with your work so my next questions would be exactly that. :+1: I've committed the changes in commit 05a8958, see the commit message for details.
@tomschr Thanks! Much appreciated.
For SEO optimization, we need to avoid the use of dots ('.') and underscores ('_') in our xml:id values.
To be merged together with companion PR: https://github.com/openSUSE/obs-landing/pull/466