moinwiki / moin

MoinMoin Wiki Development (2.0+), unstable, for production please use 1.9.x.
https://moinmo.in/

Should wiki-root be included in href? #167

Closed ThomasWaldmann closed 1 year ago

ThomasWaldmann commented 12 years ago

Original report by RogerHaase (Bitbucket: RogerHaase, GitHub: RogerHaase).


See the mark_item_as_transclusion method within converter/html_out.py for an example of the problem and a workaround.

This problem arises only when the wiki is not run at server root.

Should hrefs within the emeraldtree DOM be of the form /mywiki/myitem or just /myitem? Currently, hrefs for pages take the latter form and hrefs for objects take the former.

  1. Decide which form is preferred and correct the non-conformers.

1a. If the preferred form does not include the wiki-root, define a method for obtaining the wiki-root.
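One possible shape for such a method, as a minimal sketch only. Under Flask, the mount prefix of the running application is available as flask.request.script_root (for example /mywiki, or an empty string at server root). The helper name with_wiki_root below is hypothetical and not part of moin; it takes the prefix as a plain argument so the example runs without a server.

```python
def with_wiki_root(wiki_root: str, item_path: str) -> str:
    """Return item_path prefixed with wiki_root (hypothetical helper).

    wiki_root is assumed to be what flask.request.script_root would
    report, e.g. "/mywiki", or "" when the wiki runs at server root.
    """
    stripped = wiki_root.strip("/")
    root = "/" + stripped if stripped else ""
    path = "/" + item_path.lstrip("/")
    # Guess that a path already starting with the root is already prefixed.
    if root and (path == root or path.startswith(root + "/")):
        return path
    return root + path


print(with_wiki_root("/mywiki", "/myitem"))
print(with_wiki_root("", "/myitem"))
```

The "already prefixed" check is only a guess: with wiki_root = mywiki, the path /mywiki/x could equally be an unprefixed item named mywiki/x. That ambiguity is a reason to settle on one form inside the DOM rather than normalise after the fact.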

bylsmad commented 1 year ago

I've added a spider to check the src, href, data-href, and data attributes on the site and pointed it at my server, which is running moin behind Apache with wiki_root = devwiki.

three issues from the spider:

  1. data-href returns a 404 in the same places where src returns a 404, specifically for docbook transclusions. docbook_in transforms transclusions into moin_page.object and attempts to create the +get link manually. Proposed fix: have docbook_in create xinclude.include elements instead, as moinwiki_in does.
  2. On +meta pages, there are a couple of bad "Item Links:". I have not looked into a fix for this; it would be better placed in a separate issue since it does not relate to wiki_root.
  3. moin dump-html produces src links that contain subdirectories, e.g. src="+get/issue_167_test/photo.jpg", but on disk the subdirectory structure is flattened to +get/issue_167_test(2f)photo.jpg. I have not looked into this either; it also belongs in a separate issue since wiki_root is not involved.
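To make the mismatch in item 3 concrete, here is a small sketch of the rewrite that would make a src value agree with the flattened file name. The function name flatten_get_src is hypothetical, and the sketch assumes the slash is the only character that gets encoded as (2f); the real dump code may encode more than that.

```python
def flatten_get_src(src: str, prefix: str = "+get/") -> str:
    """Rewrite a +get src so slashes in the item name become '(2f)'.

    Hypothetical helper: it mirrors only the single example from the
    report, not moin's actual file name encoding.
    """
    if not src.startswith(prefix):
        return src  # not a +get link, leave it alone
    item_name = src[len(prefix):]
    return prefix + item_name.replace("/", "(2f)")


print(flatten_get_src("+get/issue_167_test/photo.jpg"))
```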

one simplification

  1. wiki_root is most commonly added in link.ConverterExternOutput, which gets the wiki_root from a call to flask.helpers.url_for. This seems like the right way to add wiki_root. The proposal is to alter mark_item_as_transclusion so that it adds an Iri('wiki:///SomeObject?do=show') instead of a string path, and then to include data-href attributes in the conversion done by link.ConverterExternOutput.
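The idea behind this proposal can be sketched without moin itself: keep links in the DOM as wiki:/// IRIs and add the wiki root in one place at output time. The code below is an illustration only and not the converter's code. resolve_wiki_iri and the DO_PREFIX table are hypothetical stand-ins for the routing that flask.helpers.url_for performs in moin, and urllib.parse stands in for moin's Iri class.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical table from the "do" query value to a URL prefix. In moin
# this routing is done by url_for, not by a lookup like this one.
DO_PREFIX = {"show": "", "get": "+get/", "meta": "+meta/"}


def resolve_wiki_iri(iri: str, wiki_root: str) -> str:
    """Turn a wiki:/// IRI into an href that includes wiki_root."""
    parts = urlsplit(iri)
    if parts.scheme != "wiki":
        return iri  # external link: leave untouched
    do = parse_qs(parts.query).get("do", ["show"])[0]
    if do not in DO_PREFIX:
        raise ValueError(f"unsupported do value: {do!r}")
    item = parts.path.lstrip("/")
    root = wiki_root.strip("/")
    base = f"/{root}/" if root else "/"
    return base + DO_PREFIX[do] + quote(item)


print(resolve_wiki_iri("wiki:///SomeObject?do=show", "devwiki"))
```

Because the root is added in a single function, applying the same step to data-href as to href and src would keep all three attributes consistent whether or not the wiki runs at server root.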

samples:

  1. https://picklepartysalon.com:8081/devwiki/docbook
  2. https://picklepartysalon.com:8081/devwiki/+meta/AjudaNaFormata%C3%A7%C3%A3o
  3. https://picklepartysalon.com:8081/devexport/HTML/issue_167_test
  4. analysis of the xml transformation flow: https://picklepartysalon.com/wiki/moin#image_src_atttribute

RogerHaase commented 1 year ago

  1. Pull request welcome.

  2. This is issue #1259. Markdown is fixed; I am currently working on reST. I thought docbook was OK, will check again.

  3. This was the easiest way to get html-dump working. Better ideas welcome.

  4. Pull request welcome.

Don't be shy, new issues and pull requests are welcome. Thanks for your work to date.

RogerHaase commented 1 year ago

Reopening because scrapy issue is still active.

https://github.com/scrapy/scrapy/issues/5850

RogerHaase commented 1 year ago

The work to date using Scrapy is difficult to review because of the number of commits, and it is not related to the original issue, "167 Should wiki-root be included in href". It would have been better to open a new issue for the addition of Scrapy.

It is clever to run Scrapy as part of the pytest procedure, but I wonder if that is the best place. Having to start a server running at 9080 before running tests seems an unusual requirement. Wiki admins who install a future version of moin from PyPI are unlikely to run tests frequently, if ever, but may want a means of checking for broken links.

It could be added as a sibling to /moin/contrib/loadtesting, but then it would not be available to future wiki admins that install moin from pypi.

Another alternative would be to add it under src/moin/scripts/sitetesting and add a new command moin find-broken-links. (after cloning your repo, the entire src/moin/scripts directory is missing?)

The crawl.csv and crawl.log output seems hidden under /src/moin/_tests/sitetesting/scrapy among source code. The server.log is created in /src/moin/_tests/sitetesting/server.log among source code. All of these should go in the instance root as a sibling to wikiconfig.py.

My server.log file has several SyntaxErrors:

--snip--
  File "C:\git-bylsmad\moin\.tox\py310\lib\site-packages\flask\cli.py", line 123, in call_factory
    return app_factory(*args, **kwargs)
  File "C:\git-bylsmad\moin\.tox\py310\lib\site-packages\moin\app.py", line 50, in create_app
    return create_app_ext(flask_config_file=config,
  File "C:\git-bylsmad\moin\.tox\py310\lib\site-packages\moin\app.py", line 99, in create_app_ext
    app.config.from_pyfile(path.abspath(flask_config_file))
  File "C:\git-bylsmad\moin\.tox\py310\lib\site-packages\flask\config.py", line 120, in from_pyfile
    exec(compile(config_file.read(), filename, "exec"), d.__dict__)
  File "C:\git-bylsmad\moin\src\moin\_tests\sitetesting\wikiconfig.py", line 1
    ../../config/wikiconfig.py
    ^
SyntaxError: invalid syntax

The missing links ending in Discussion should be ignored/accepted as these are links to create a Discussion subpage should none exist. The 'Discussion' variable needs to be retrieved from wikiconfig, see /src/moin/config/default.py supplementation_item_names in case some wiki admin changes it to something other than English.
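A minimal sketch of such a filter follows. The function name is_supplementation_link is hypothetical, the example.org URLs are made up, and the list of names is passed in as an argument on the assumption that the spider reads supplementation_item_names from the wiki's configuration instead of hard-coding 'Discussion'.

```python
from urllib.parse import unquote, urlsplit


def is_supplementation_link(url: str, supplementation_item_names: list[str]) -> bool:
    """True if the last path segment of url is a supplementation item name."""
    path = unquote(urlsplit(url).path).rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    return last_segment in supplementation_item_names


# Illustrative value only; the spider would take this from wikiconfig.
names = ["Discussion"]
print(is_supplementation_link("https://example.org/devwiki/Home/Discussion", names))
print(is_supplementation_link("https://example.org/devwiki/Home", names))
```

Comparing the whole last segment, not a suffix, avoids wrongly skipping an item such as MyDiscussion.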

If you follow the practice that Git commits are cheap and you should commit whenever you have a bit of code you like, then you should use git rebase to squash the commits and clean up the commit message. This would make the work easier to review and easier to maintain. It would be nice to have one commit resolve one issue, but this is frequently not achievable.

I am creating new issue #1375; please add future Scrapy activity there. Closing this issue as complete.