PASTAplus / web-x

The EDI Website Project
Apache License 2.0

Removing '.md' strings from template files breaks some URLs #45

Open clnsmth opened 1 year ago

clnsmth commented 1 year ago

In content-x, content creators use relative links between markdown templates to simulate navigation among cross-referenced web pages. Before new content is released, the relative links are converted from the GitHub context to the web-x context. One step in this process is removing '.md' strings from the template file contents. See:

https://github.com/PASTAplus/web-x/blob/198ac37d78dec3f134ef73300f43812d594e6e10/utilities/build.py#L262

This works until a link's target URL itself points to a markdown document, in which case build.py strips the '.md' and web-x readers get a 404 when they click the link.

My attempts to solve this with a regex using a lookbehind assertion (i.e., don't remove '.md' when it is part of a URL) were unsuccessful.
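For reference, here is a rough sketch of a different regex approach that avoids lookbehind altogether (Python's `re` only accepts fixed-width lookbehinds, which makes "preceded by a URL" hard to express). It assumes links are written as inline markdown links, `[text](target)`, and treats any target with a scheme such as `https://` as an absolute URL to leave alone. The function name and sample text are hypothetical and are not taken from build.py.

```python
import re

# Matches the target of an inline markdown link: "](target)".
LINK_TARGET = re.compile(r"\]\(([^)\s]+)\)")

# Matches a leading URL scheme such as "https://" (assumed marker of an absolute URL).
SCHEME = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*://")


def strip_md_from_relative_links(text):
    """Remove '.md' only from link targets that are not absolute URLs."""

    def fix(match):
        target = match.group(1)
        # Leave absolute URLs untouched, including ones ending in '.md'.
        if SCHEME.match(target):
            return match.group(0)
        # Strip '.md' at the end of the path, keeping any '#anchor' suffix.
        return "](" + re.sub(r"\.md(?=#|$)", "", target) + ")"

    return LINK_TARGET.sub(fix, text)


sample = (
    "See [about](about.md#team) and "
    "[spec](https://example.org/docs/spec.md)."
)
result = strip_md_from_relative_links(sample)
print(result)
```

This only touches inline link targets, so reference-style links or bare '.md' strings in prose would need separate handling.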

Other ideas?

servilla commented 1 year ago

Is there an example of a template file containing a URL with an embedded markdown file reference we may use as a test case?

servilla commented 1 year ago

One thought is to encode the dot "." in the .md string so that the replace function does not find it. It's a bit kludgy, but it should work.

servilla commented 1 year ago

In the URL you would replace . with %2E, so the markdown extension would change from .md to %2Emd. This should be an acceptable encoding for URLs, and it hides the .md from the replace function.
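A minimal sketch of why this should work, assuming the build step is a plain string replacement of '.md' (the `strip_md` function below is a stand-in, not the actual code in build.py, and the example URLs are made up):

```python
from urllib.parse import unquote


def strip_md(text):
    # Stand-in for the build step: remove every literal '.md' string.
    return text.replace(".md", "")


relative = "[about](about.md)"
external = "[spec](https://example.org/docs/spec%2Emd)"

built_relative = strip_md(relative)   # relative link loses its '.md'
built_external = strip_md(external)   # percent-encoded link is left alone

# Percent-decoding turns %2E back into '.', recovering the '.md' path.
decoded = unquote("https://example.org/docs/spec%2Emd")

print(built_relative)
print(built_external)
print(decoded)
```

The encoded link passes through the replacement unchanged because the literal string '.md' never appears in it, while the percent-decoded form still ends in '.md'.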

clnsmth commented 1 year ago

Nice @servilla. The %2E kludge works.

For an example, see this news article, specifically the hyperlinked "COUNTER Code of Practice for Research Data in Repositories".

I've noted this workaround in the content-x contributing guidelines under the "Formatting and style" section.

Call this issue "closed" for now?