fedwiki / wiki-client

Federated wiki client-side javascript as a npm module.

Wiki page missing, yet its content shows up in wiki search... #284

Open cliveb opened 2 years ago

cliveb commented 2 years ago

Any idea why a wiki page goes missing?

http://clive.tries.fed.wiki/view/recent-changes/view/teradata-v-sap

The page content shows up in search.

I am using a Chromebox, as I normally do when writing on wiki, with the latest ChromeOS.

I wrote this page from about 1 am to 3 am PST on Feb 3, 2022.

Thanks for any pointers.

cliveb commented 2 years ago

Update. I found the page from the wiki search by clicking the colored wiki icon square in the browser (not the blue link in the browser). The page loaded in the open window and I forked it, but only the paragraph visible in the wiki search preview was forked.

Is this my user error?

I noticed another inconsistency (not 100 percent sure): I found I could fork the page without signing in to my wiki account. Again, the fork was only the paragraph seen in the wiki search browser preview.

Any chance the wiki server DB has reached its allocated size limit, and is now writing beyond memory? A similar pattern infamously happened to Foursquare, whose database had disk space but the space was not allocated in the DB, which caused the index to be searchable while the document pages were not found (I think 4sq was on MongoDB).


paul90 commented 2 years ago

Interesting...

Looking at the list of pages, I see the slug is teradata-v-sap-. So, I suspect the title has a trailing space. On the "Deep Thoughts" page the link has a trailing space.

You ask, "User error?" - well that may be, but it uncovers some inconsistency in how spaces in links, titles, and slugs are trimmed. Which will need investigating.

In the sitemap we see,

{
  "slug":"teradata-v-sap-",
  "title":"Teradata v SAP",
  "date":1643884690759,
  "synopsis":"A good example of a So...186/603 docket]"
}

which suggests that the trailing space in the title has been trimmed off at some point. Which creates a mismatch between the slug generated from the title, and the slug used to save the page - which almost certainly explains the other strangeness.
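The mismatch described above can be reproduced with a minimal sketch. The function `asSlugSketch` is a hypothetical stand-in, not the wiki-client source; it assumes each whitespace character becomes a dash, other non-alphanumeric characters are dropped, and the result is lowercased.

```javascript
// Hypothetical slug rule, assumed for illustration only: whitespace becomes a
// dash, other non-alphanumerics are dropped, and the result is lowercased.
function asSlugSketch(title) {
  return title.replace(/\s/g, '-').replace(/[^A-Za-z0-9-]/g, '').toLowerCase();
}

const typedTitle = 'Teradata v SAP ';   // title as typed, with a trailing space
const storedTitle = typedTitle.trim();  // title as it appears in the sitemap

const savedSlug = asSlugSketch(typedTitle);    // slug the page was saved under
const lookupSlug = asSlugSketch(storedTitle);  // slug derived from the trimmed title

console.log(savedSlug);                 // teradata-v-sap-
console.log(lookupSlug);                // teradata-v-sap
console.log(savedSlug === lookupSlug);  // false
```

Under these assumptions, any lookup that starts from the trimmed title asks for a slug that was never saved, which would be consistent with the page appearing to be missing.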

cliveb commented 2 years ago

Yes. Even wiki page edits and page forks are malformed. The trailing space appears to cause all the strangeness.

cliveb commented 2 years ago

I safely copied the content of the trailing-space page to a new page. For anyone else who stumbles on this, the key to finding the orphaned page is to use wiki search to find the page in the browser, then click the colored wiki square in the search results, not the blue page link (I think).

paul90 commented 1 year ago

@WardCunningham - finally back to looking at this again. On the face of it, it looks as if it should be just a matter of trimming off any spaces. But I see there is a test which suggests that this is intended behaviour.

    it 'should convert spaces to dashes', ->
      s = wiki.asSlug ' now is  the time '
      expect(s).to.be '-now-is--the-time-'

I think this test is wrong, and it should be expecting now-is--the-time.


That said, without a change on the server to detect and correct them, any pages that had spaces at the beginning or end of the title will become inaccessible.
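As a rough sketch of what "detect" might look like, the code below trims the title before building the slug and then flags sitemap entries whose stored slug no longer matches. All names here (`slugOf`, `trimmedSlugOf`, `findMismatched`) are hypothetical, the slug rule is an assumption, and the second sitemap entry is made up for illustration; none of this is the actual server or client code.

```javascript
// Assumed slug rule (hypothetical): whitespace to dashes, drop other
// non-alphanumerics, lowercase.
function slugOf(title) {
  return title.replace(/\s/g, '-').replace(/[^A-Za-z0-9-]/g, '').toLowerCase();
}

// Proposed behaviour: ignore leading and trailing whitespace in the title.
function trimmedSlugOf(title) {
  return slugOf(title.trim());
}

// Return the sitemap entries whose stored slug differs from the slug that
// would be generated from their title after trimming.
function findMismatched(sitemap) {
  return sitemap.filter((entry) => entry.slug !== trimmedSlugOf(entry.title));
}

const sitemap = [
  { slug: 'teradata-v-sap-', title: 'Teradata v SAP' }, // from this issue
  { slug: 'deep-thoughts', title: 'Deep Thoughts' },    // illustrative entry
];

console.log(trimmedSlugOf(' now is  the time '));        // now-is--the-time
console.log(findMismatched(sitemap).map((e) => e.slug)); // [ 'teradata-v-sap-' ]
```

How a flagged page should then be corrected (renamed on disk, redirected, or left alone) is a separate decision that this sketch does not address.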

WardCunningham commented 1 year ago

I feel like in any situation where the spaces would be visible we would be justified in honoring them.

A case could be made that trailing spaces are invisible when typing into the Create New Page form from the hamburger menu and would be improved by trimming input.

WardCunningham commented 1 year ago

Noted elsewhere: http://hive.dreyeck.ch/view/slug

cliveb commented 1 year ago

I feel trailing spaces in the page title are inadvertent typing.

WardCunningham commented 1 year ago

Yes. I recognize that it is not what was intended. This falls into the same category as a typo that could be corrected by a spell checker. The question becomes that of when the correction is to be applied. As typed? As stored? As displayed?

I can think of two simple solutions that were available to me in the beginning:

Both of these would be "as typed" solutions that are no longer available to us.

Now might be a good time for Paul to elaborate what he might have been thinking when he says, "change on the server to detect, and correct".

Aside: We are generous as to what we allow as a title. The conversion to slug is lossy and introduces ambiguity, some of which is desirable. A slug is converted to title by retrieving the definitive text from the page json. I say we are generous as one can find obscure unicode in titles, including double spaces and emoji, but not ]] because we offer no way to type that inside [[ and ]].
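The lossiness mentioned in the aside can be illustrated with a small sketch. It assumes a slug rule of whitespace to dashes, other non-alphanumerics dropped, and lowercasing; `lossySlug` and `groupBySlug` are hypothetical names, and the alternative titles are invented for the example.

```javascript
// Assumed slug rule (hypothetical, for illustration only).
function lossySlug(title) {
  return title.replace(/\s/g, '-').replace(/[^A-Za-z0-9-]/g, '').toLowerCase();
}

// Group titles by the slug they collapse to.
function groupBySlug(titles) {
  const groups = new Map();
  for (const title of titles) {
    const slug = lossySlug(title);
    if (!groups.has(slug)) groups.set(slug, []);
    groups.get(slug).push(title);
  }
  return groups;
}

const groups = groupBySlug(['Teradata v SAP', 'Teradata v. SAP', 'teradata V sap!']);

console.log(groups.size);                         // 1
console.log(groups.get('teradata-v-sap').length); // 3
```

Three visibly different titles share one slug here, which is why the definitive title has to come from the page json rather than from the slug.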

paul90 commented 4 months ago

Maybe the start/end spaces should be shown as U+2420, 'Symbol for Space' ␠
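A minimal sketch of that idea, assuming it is applied only when a title is displayed and only to plain space characters at the very start or end. The function name `showEdgeSpaces` is hypothetical.

```javascript
const SYMBOL_FOR_SPACE = '\u2420'; // U+2420 SYMBOL FOR SPACE

// Replace each leading and trailing space with U+2420 so it becomes visible.
// Interior spaces, including double spaces, are left untouched.
function showEdgeSpaces(title) {
  return title
    .replace(/^ +/, (run) => SYMBOL_FOR_SPACE.repeat(run.length))
    .replace(/ +$/, (run) => SYMBOL_FOR_SPACE.repeat(run.length));
}

console.log(showEdgeSpaces('Teradata v SAP '));     // Teradata v SAP␠
console.log(showEdgeSpaces(' now is  the time '));  // ␠now is  the time␠
console.log(showEdgeSpaces('Deep Thoughts'));       // Deep Thoughts
```

This keeps the stored title and slug unchanged and only affects how the title is rendered.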