Possible improvements for WikiWord linking

GoogleCodeExporter commented 9 years ago

Allow the user to give a list of WikiWords to the parser (e.g. the list of 
Wiki pages belonging to the project). The parser would link those wikiwords 
even if they are not [InBrackets]. e.g.

parser.wikiWords( [... list of words...] );
parser.addWikiWord( string );
var list = parser.wikiWords();
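
A minimal sketch of how such a word list could behave, assuming a stand-alone helper rather than the real parser. `WikiWordRegistry`, `linkify()` and the generated `href` are hypothetical names made up for illustration; only `wikiWords()` and `addWikiWord()` come from the proposal above.

```javascript
// Hypothetical sketch: WikiWordRegistry is NOT part of the actual parser.
function WikiWordRegistry() {
    this.words = {};
}

// Combined getter/setter, mirroring the proposed parser.wikiWords() call.
WikiWordRegistry.prototype.wikiWords = function (list) {
    if (arguments.length) {
        this.words = {};
        for (var i = 0; i < list.length; ++i) this.addWikiWord(list[i]);
        return this;
    }
    return Object.keys(this.words);
};

WikiWordRegistry.prototype.addWikiWord = function (word) {
    this.words[word] = true;
    return this;
};

// Links only registered words; any other WikiWord passes through untouched.
// The href format is an assumption, not what the parser really emits.
WikiWordRegistry.prototype.linkify = function (text) {
    var self = this;
    return text.replace(/\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\b/g, function (word) {
        return Object.prototype.hasOwnProperty.call(self.words, word)
            ? '<a href="' + word + '">' + word + '</a>'
            : word;
    });
};
```

With `HomePage` registered, `linkify('See HomePage and OtherPage')` would link `HomePage` and leave `OtherPage` as plain text.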

Another idea is to mark all non-bracketed WikiWords with a '?' (like GoCo 
does for WikiWords which do not point to a local wiki page). The point 
would be to tell the user, "hey, bracket that WikiWord!" (IMO, only 
bracketed/marked-up stuff should be linked by GoCo, and users should be 
explicit about what they want linked.).

Another idea would be passing objects in the form:

{word:WikiWord,
 linker: function(Word){} // returns an HTML link.
}

The point would be to allow 3rd-party apps (like our own previewer/editor) 
to install customized link handlers. We do this in the previewer so that it 
can load local Wiki pages directly instead of letting the browser try to load 
them via HTTP.
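
A rough sketch of how per-word linker objects might be consulted, assuming the `{word, linker}` shape shown above. `renderWikiWord()`, `defaultLinker()` and the `data-page` attribute are hypothetical and exist only for this example.

```javascript
// Hypothetical sketch: none of these names exist in the actual parser.
function defaultLinker(word) {
    return '<a href="' + word + '">' + word + '</a>';
}

// entries: array of {word: string, linker: function(word) -> HTML string}
function renderWikiWord(word, entries) {
    for (var i = 0; i < entries.length; ++i) {
        if (entries[i].word === word) {
            // Fall back to a plain link when no custom linker is given.
            var linker = entries[i].linker || defaultLinker;
            return linker(word);
        }
    }
    return word; // unknown words are left unlinked
}

var entries = [
    { word: 'HomePage' },
    { word: 'LocalPage', linker: function (w) {
        // A previewer could hook its own page loader onto this element.
        return '<a href="#" data-page="' + w + '">' + w + '</a>';
    } }
];
```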

Another idea would be to do what GoCo does and just link anything which 
looks like a WikiWord. That, of course, would be most compatible.

Original issue reported on code.google.com by sgbeal@googlemail.com on 23 Apr 2010 at 3:27

GoogleCodeExporter commented 9 years ago

Patch that adds WikiWord support, as described in description's last paragraph: 
"do what GoCo does and just link anything which looks like a WikiWord. That, of 
course, would be most compatible."

Original comment by atte.kemppila on 20 Jul 2011 at 4:48

Attachments:

WikiWord.patch

GoogleCodeExporter commented 9 years ago

Great!

i'll get this integrated ASAP (or Fabien - if you're listening?)

i'm impressed that you could grok the parse/buffer manipulation code!

Reminder to self: expand createLink() to know a given WikiWord links to an 
existing page. We should publish the page list to external clients (if that's 
not already done) so that clients which override createLink() (like my main 
wiki does!) can take advantage of this.

Original comment by sgbeal@googlemail.com on 20 Jul 2011 at 5:42

GoogleCodeExporter commented 9 years ago

@Fabien: never mind. i'm already working on it.

Original comment by sgbeal@googlemail.com on 20 Jul 2011 at 5:47

GoogleCodeExporter commented 9 years ago

i just tried this out on the demo site and there's a slight bug: !WikiWord is 
no longer handled correctly.

That seems to be a side-effect of the this.rx.wikiWord change, which 
(apparently) causes the !UnlinkedWikiWord block (just above the new block) to 
not recognize the word.

i'll see if i can fix this quickly. Or if you've got an idea how to fix the 
wikiWord regex, please let me know.

Original comment by sgbeal@googlemail.com on 20 Jul 2011 at 5:56

GoogleCodeExporter commented 9 years ago

i've changed wikiWord to:

        wikiWord: /\b((?:[A-Z][a-z]+){2,}\w+?)\b/,

but now i have the problem that this:

`!``ExplicitNonWikiWords`

(from the demo page)

doesn't parse: it comes out with ExplicitNonWikiWords hyperlinked.
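
A small check of what the changed regex matches when used on its own. The regex is copied from above; `firstWikiWord()` is a hypothetical helper. Because `\b` also matches right after a `!` or a backtick, the regex itself cannot tell an escaped word from a plain one, so any suppression has to happen in the surrounding parser code.

```javascript
// The wikiWord regex from the comment above, unchanged.
var wikiWord = /\b((?:[A-Z][a-z]+){2,}\w+?)\b/;

// Hypothetical helper: returns the first WikiWord-looking token, or null.
function firstWikiWord(text) {
    var m = wikiWord.exec(text);
    return m ? m[1] : null;
}

// firstWikiWord('!WikiWord') returns 'WikiWord': the leading '!' is
// simply skipped, since a word boundary sits between '!' and 'W'.
```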

Original comment by sgbeal@googlemail.com on 20 Jul 2011 at 6:11

GoogleCodeExporter commented 9 years ago

I was wondering how fast someone would find some bug in it. :) 

Sure, I can take a look (probably not until tomorrow). But can you give me a 
pointer to which demo page/site you are referring to?

Original comment by atte.kemppila on 20 Jul 2011 at 6:19

GoogleCodeExporter commented 9 years ago

Hmm... Perhaps it would be better to just check that the WikiWord's prevChar 
is whitespace?

if( prevChar && !/\s/.test(prevChar) )
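
To make the idea concrete, here is a sketch of that check as a stand-alone function. `shouldLinkAt()` and its arguments are hypothetical and not taken from the patch; only the condition itself is from the line above.

```javascript
// Hypothetical helper: decides whether a WikiWord starting at `index`
// in `line` may be linked, based only on the character before it.
function shouldLinkAt(line, index) {
    var prevChar = index > 0 ? line.charAt(index - 1) : '';
    if (prevChar && !/\s/.test(prevChar)) {
        return false; // preceded by '!', '`', a letter, etc.
    }
    return true; // start of line, or preceded by whitespace
}
```

One consequence of such a strict check is that a word directly after punctuation, e.g. `(WikiWord)`, would not be linked either.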

I'll look into this in more detail tomorrow.

Original comment by atte.kemppila on 20 Jul 2011 at 6:31

GoogleCodeExporter commented 9 years ago

http://fossil.wanderinghorse.net/demos/wikiwym/GoCoWi-previewer.html

that's got my local copy of the changes described above.

Original comment by sgbeal@googlemail.com on 20 Jul 2011 at 7:43

GoogleCodeExporter commented 9 years ago

comment 4: !WikiWord works for me. Can you give an example where !WikiWord 
doesn't work?

Anyway, I made a test page: http://code.google.com/p/atte-sandbox/wiki/WikiSyntax

I also updated the patch to handle numbers in WikiWords.

With that patch, the only big bug I can see there is that WikiWord and 
!WikiWord don't work inside inline code blocks: `WikiWord`, {{{WikiWord}}}, 
`!WikiWord` and {{{!WikiWord}}}. Other than that, everything seems to work OK 
when compared to the actual page in Google.

Original comment by atte.kemppila on 21 Jul 2011 at 9:43

Attachments:

WikiWord.patch

GoogleCodeExporter commented 9 years ago

The above-mentioned bug is caused by how inline code blocks are handled in 
parseLineVerbatim(). That is, special characters are encoded so that later in 
the loop (where my WikiWord code is) the special characters are not recognized 
anymore.

A quick fix would be to change line 446:

// replace(/[!_*,^~\[]/g, function($0) { return '&#0'+$0.charCodeAt(0)+';'; })
replace(/[A-Z!_*,^~\[]/g, function($0) { return '&#0'+$0.charCodeAt(0)+';'; })
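
As an illustration of what the changed replace() does, here is the same call wrapped in a stand-alone function. `encodeVerbatim()` is a hypothetical name; the regex and the callback are copied from the line above.

```javascript
// Hypothetical wrapper around the replace() call from the quick fix.
function encodeVerbatim(text) {
    return text.replace(/[A-Z!_*,^~\[]/g, function ($0) {
        // Each matched character becomes a numeric HTML entity,
        // e.g. 'W' (char code 87) becomes '&#087;'.
        return '&#0' + $0.charCodeAt(0) + ';';
    });
}

// encodeVerbatim('!WikiWord') returns '&#033;&#087;iki&#087;ord'
```

Since the encoded text no longer contains any capital letters, a later WikiWord regex has nothing to match inside the code span, while the browser still renders the entities as the original characters.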

Now my test page should work just fine. But IMHO, that and actually the whole 
parseLineVerbatim() is a bit of a hack. Any reason ` and {{{ are not handled in 
the same for loop as WikiWord, !WikiWord etc.? Any thoughts on how this should 
be fixed?

And are there any other bugs related to this WikiWord patch?

Original comment by atte.kemppila on 21 Jul 2011 at 9:47

GoogleCodeExporter commented 9 years ago

@#10: it's in the weird case `!``WikiWord`, which is used one time on the 
SupportedSyntax page.

Yes, parseLineVerbatim() is definitely a hack. The reason for handling inlined 
{{{ }}} separately from block-level {{{ }}} had to do with a syntactic 
ambiguity, IIRC. The results of:

{{{
line of code;
}}}

and {{{line of code}}} are much different (IIRC).

i will try this fix later on, but i've been awake for about 30 hours now and i 
can't concentrate on it :). i appreciate you taking the time to patch this :). 
i'll post my results here.

Original comment by sgbeal@googlemail.com on 21 Jul 2011 at 10:06

GoogleCodeExporter commented 9 years ago

`!``WikiWord` works if you use the patch from comment 9 and the quick fix from 
10. I combined those in the included patch here just to make it clear.

Regarding those inline and block-level {{{ }}}: isn't it quite easy to tell the 
difference? If "{{{" is at the beginning of the line and there's nothing after 
it, it's block-level. Otherwise (not at the beginning of the line and/or 
something following it) it's inline. Or am I missing something here?
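
The rule described above could be sketched like this. `classifyCodeOpener()` is a hypothetical function, not existing parser code, and it looks at a single line in isolation.

```javascript
// Hypothetical classifier for the rule described above.
function classifyCodeOpener(line) {
    var pos = line.indexOf('{{{');
    if (pos < 0) return 'none';
    var atLineStart = pos === 0;
    var nothingAfter = line.slice(pos + 3).trim() === '';
    // Block-level only when "{{{" starts the line AND nothing follows it.
    return (atLineStart && nothingAfter) ? 'block' : 'inline';
}
```

Under this rule, a line consisting of just `{{{` is block-level, while `{{{line of code}}}` is inline.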

Original comment by atte.kemppila on 21 Jul 2011 at 4:59

Attachments:

WikiWord_parseLineVerbatim.patch

GoogleCodeExporter commented 9 years ago

Ping! i have not forgotten about this, i just haven't gotten around to playing 
with it. i am using wikiwym heavily in another project, so i will be certain to 
eventually get around to this. My apologies for the delay.

Original comment by sgbeal@googlemail.com on 16 Aug 2011 at 5:57

GoogleCodeExporter commented 9 years ago

An alternative, if Fabien doesn't mind (he's the one with the admin rights): we 
could add Atte to the commit list.

@Atte: assuming you would like to commit this yourself, i think the only 
"unwritten rule" we have so far regarding changes is that the 
SupportedWikiSyntax wiki page should work. My only problem with the patch so 
far is the incorrect wiki-linking of the weird construct i demonstrated above. 
i haven't yet tried the 2nd patch (which reportedly fixes that).

@#12: ^{{{ can also be a part of:

{{{inline block}}} non-block code

so i think the transformation of {{{...}}} to `...` is still semantically sound 
(but it IS a hack).

Original comment by sgbeal@googlemail.com on 16 Aug 2011 at 6:03

GoogleCodeExporter commented 9 years ago

And to correct/expand part of that last comment (because i didn't answer the 
question i was trying to answer): ^{{{ can also be a part of...

If we delay the determination until later (as suggested in #12), we have to 
buffer and backtrack more, which opens up more room for errors. We try to do 
forward-only parsing (there might be a couple exceptions to this, though). The 
regex-based parts of the parser are "quick/easy hacks" (i am allowed to say 
that because i put them there ;). i would prefer to have a true byte-by-byte 
parser with as little back-tracking as possible. But the current code works 
well for everything i use it for, so i have had no inspiration to go back and 
"fix" it (it ain't broken, just not 100% how i'd prefer to see it).

Happy Hacking!

Original comment by sgbeal@googlemail.com on 16 Aug 2011 at 6:17

PiRSquared17 / wikiwym

Possible improvements for WikiWord linking #13