dokufreaks / plugin-blog

Use DokuWiki as blogging tool
https://dokuwiki.org/plugin:blog
GNU General Public License v2.0

[IGOR] New page with semicolon in title results in colon in namespace #114

Closed DB1BMN closed 1 year ago

DB1BMN commented 1 year ago

Hi,

On "Igor", it seems that when I create a new page with a semicolon ';' in the title, e.g. 'Test; Word', the page ends up in the wrong place.

The title is created correctly, but the file name then contains a colon ':', i.e. '%date%_test:word.txt'. So a page 'word' is created in the sub-namespace '%date%_test'.

Can anybody confirm?

Best Regards /M.

fiwswe commented 1 year ago

DW 2022-07-31a "Igor", PHP 8.0.24, Blog (Last updated on 2020-09-19)

Confirmed.

Normally my test Wiki places blog pages into namespace blog:. When using the Title Test; Issue #114 the new page was placed into the sub-namespace blog:test: (id=blog:test:issue_114).

But this does not seem to be caused by the Blog plugin. I tried placing this source code on a wiki page: [[Test; Issue #114 alternate]] and it created a link to the (non-existent) page id=test:issue#alternate. This actually shows 3 problems:

  1. Your issue: the ; is replaced with a : and thus makes the first part of the title a namespace.
  2. The # was kept, despite it having a special meaning in URLs: it introduces the "fragment", which has the value alternate in my test case.
  3. The remaining page id was shortened so much that it might become ambiguous, leaving only issue.

Note: It is interesting to see that the same title used in the Blog form does not cause (2) and (3). The Blog entry for Test; Issue #114 alternate is id=blog:test:issue_114_alternate. So it seems slightly different rules apply for converting a title to a page id. In the case of the Blog form, the function cleanID() seems to be responsible for the conversion. In the case of the link, i.e. (2) & (3), the function page_exists() might be involved, but I have not done intensive testing yet. It does split the name at the # though.

So it might be better to reopen this issue at https://github.com/splitbrain/dokuwiki/issues.

fiwswe

DB1BMN commented 1 year ago

Thanks! Yes, I tried the other special characters of the German keyboard and it seems to be a DW-specific problem. Will post there: https://github.com/splitbrain/dokuwiki/issues/3857

Klap-in commented 1 year ago

I think this is an issue with the blog plugin. The blog plugin creates a new entry ID with the following function: https://github.com/dokufreaks/plugin-blog/blob/2808107e28f59ce2a903399eef8bd8918b378966/action.php#L192-L214

Before that function is called, the title is stripped of the :, but not of the ;. The cleanID() function then cleans that character as well: https://codesearch.dokuwiki.org/xref/dokuwiki/inc/pageutils.php#120

The cleaning behaviour and handling of the #, % etc are the right logic for the cleanID() function.

So what is needed is that the _newEntryID() function takes more care of these characters, as they have a special meaning.

DB1BMN commented 1 year ago

Sorry, I don't think so. Try to create a new page by entering "new;page" in the address bar of the browser directly and you will end up in "new:page" i.e. a new sub-namespace.

Klap-in commented 1 year ago

That is on purpose. The assumption is that if you type a new or existing page id manually (in the URL etc.), you have probably made a typo by swapping a ; for a :. All these special characters are forbidden in a page id; the allowed set of special characters is rather small (see also https://www.dokuwiki.org/pagename#naming_conventions ). Because the ; is forbidden and should not be there anyway, automatically replacing it with a : is a safe action.

For a new blog entry there is a form for typing a title. This title is used in two ways: 1) as the first heading (= title) on the blog page, 2) as the page id. For the page id we have to convert it to a cleaned name that respects the naming convention. This is done with cleanID(). Because this cleaning interprets some characters differently than desired, the _newEntryID() function should already strip the special characters that cleanID() would otherwise interpret wrongly.
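
To illustrate the effect described above, here is a minimal, self-contained sketch. The function toyCleanId() is a hypothetical toy stand-in written only for this example; it is not DokuWiki's real cleanID() and ignores configuration options, Unicode handling and everything else the real function does. It only mimics the one behaviour discussed here: a ; is treated as a namespace separator.

```php
<?php
// Toy stand-in for the id-cleaning step. NOT DokuWiki's real cleanID();
// the name and the exact rules are assumptions made for illustration.
function toyCleanId(string $raw): string
{
    $id = strtolower(trim($raw));
    // Treat ';' as a mistyped namespace separator ':'.
    $id = strtr($id, ';', ':');
    // Replace every run of characters other than a-z, 0-9 and ':' with '_'.
    $id = preg_replace('/[^a-z0-9:]+/', '_', $id);
    // Drop underscores that touch a namespace separator.
    $id = preg_replace('/_*:_*/', ':', $id);
    // Trim leftover separators at both ends.
    return trim($id, ':_');
}

// The raw title is split into a namespace and a page name.
echo toyCleanId('Test; Word'), "\n";                       // test:word

// Stripping the ';' first keeps the title in a single page name.
echo toyCleanId(str_replace(';', '', 'Test; Word')), "\n"; // test_word
```

Under these assumptions, the title only stays in one namespace if the ; is removed before the cleaning step ever sees it.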

fiwswe commented 1 year ago

Ok, if we restrict ourselves to just the Blog plugin then https://github.com/dokufreaks/plugin-blog/blob/master/action.php#L91 seems to be the place where any cleanup should happen, I think. Currently only : is filtered as @Klap-in mentioned:

87   function _handle_newEntry(Doku_Event $event) {
88        global $ID, $INFO;
89
90        $ns    = cleanID($_REQUEST['ns']);
91        $title = str_replace(':', '', $_REQUEST['title']);
92        $ID    = $this->_newEntryID($ns, $title);

Maybe changing this to filter out other special characters such as ;, #, &, %, /, \, ? would help?

91        $title = str_replace([':', ';', '#', '&', '%', '/', '\\', '?'], '', $_REQUEST['title']);

I have added PR #115 for this proposed change.
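
For anyone who wants to try the filtering outside of DokuWiki, here is a small standalone sketch of the same str_replace() call. The helper name stripTitleForId() is made up for this example and does not exist in the Blog plugin; the character list is simply the one proposed above.

```php
<?php
// Hypothetical helper for illustration; not part of the Blog plugin.
// It removes the characters proposed above from a human readable title.
function stripTitleForId(string $title): string
{
    $special = [':', ';', '#', '&', '%', '/', '\\', '?'];
    return str_replace($special, '', $title);
}

$titles = ['Test; Word', 'Test; Issue #114 alternate', 'a/b\\c?d'];
foreach ($titles as $title) {
    echo $title, ' => ', stripTitleForId($title), "\n";
}
// Test; Word => Test Word
// Test; Issue #114 alternate => Test Issue 114 alternate
// a/b\c?d => abcd
```

Note that deleting the characters joins whatever was on either side of them (a/b becomes ab). Replacing them with a space instead would keep the words apart, but that is a design choice and not what the proposed change does.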

Note: I still think the issue is more general than the Blog plugin. Basically we should differentiate between a human readable title and a page id and treat them separately. But in the case of the Blog plugin we know that we are dealing with a human readable title, not a page id. So there is no need to invent complicated logic to figure out what we are dealing with.

But these are questions for https://github.com/splitbrain/dokuwiki, not for the Blog plugin.

Other plugins that allow page titles to be entered, such as the Add New Page Plugin, might also require some changes. But I have not checked this yet.

And btw: https://www.dokuwiki.org/pagename#naming_conventions should probably mention ; as an alternate namespace separator.