evazion closed this issue 5 years ago
To add the classes that require database lookups, couldn't a secondary parser be added to the mix, one that takes the output of the DText renderer and adds the classes that require database lookups?
Just a note that the entire site has moved to the ragel dtext gem, so this can now be solved with a pull request to https://github.com/r888888888/dtext_rb
Re: the secondary parser, I think that would pretty much defeat the purpose, since, AFAIK, the primary objective of the ragel parser is to significantly speed up parsing and reduce object garbage by moving the parsing into an external C library. That said, perhaps it could be coded to accept callbacks from the caller and use them to retrieve info from the database. Or maybe a secondary parser wouldn't generate that many objects, though I doubt it.
That's not the only advantage of the ragel parser. Since it's token based, it can parse elements based on context, so things like inline spoiler tags are finally possible.
As for embedding tag data, it's certainly possible. The C code can make calls out to Ruby code. I worry about all the extra calls being made, though. Keep in mind wiki pages are embedded in every tag search.
An attempt at this:
http://devbooru.evazion.ml/wiki_pages/23837 https://github.com/evazion/dtext_rb/commit/dd82553f52a5aeaeb5de9555dc365e4645d08a1b https://github.com/evazion/danbooru/commit/7431dbf72039bd8fdc84e772d0d3e9e0795e0003
The approach is to pass a ruby block to the parser, which the parser calls whenever it encounters a wiki link. The block looks up the tag's type and generates the link. Looks like this:
```ruby
DTextRagel.parse(text) { |type, str|
  if type == "basic_wiki_link"
    tag = Tag.normalize_name(str)
    n = Tag.category_for(tag)
    %(<a class="dtext-link dtext-wiki-link tag-type-#{n}" href="/wiki_pages/show_or_new?title=#{u tag}">#{h str}</a>)
  end
}
```
It works, but it is indeed slower for pages with large numbers of links: parsing list_of_original_characters takes ~600ms. I think caching calls to dtext would mitigate this, but I don't know if memcaching everything would create too much memory pressure in memcached.
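A minimal sketch of what that caching could look like (hypothetical code, not Danbooru's actual implementation): key the rendered output by a digest of the source text, so repeated renders of the same wiki page skip the parser and all the per-tag lookups. A plain Hash stands in for memcached here, and `render_dtext` stands in for the real `DTextRagel.parse` call.

```ruby
require "digest"

# Plain Hash standing in for memcached (illustration only).
DTEXT_CACHE = {}

# Placeholder renderer standing in for DTextRagel.parse.
def render_dtext(text)
  "<p>#{text}</p>"
end

# Memoize rendered DText by a digest of the source text, so the same
# input text is only ever parsed once.
def cached_dtext(text)
  key = "dtext:#{Digest::MD5.hexdigest(text)}"
  DTEXT_CACHE[key] ||= render_dtext(text)
end
```

The trade-off is exactly the one raised above: render time becomes a cache hit for unchanged pages, at the cost of memcached memory proportional to the total volume of rendered DText.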
Can you investigate what exactly slows things down? Repeatedly switching back to Ruby to execute the block, object creation, or calling the Tag functions? The thing is, I'm pretty sure tag categories are already memcached, since they're pulled on every post display, comment listing, etc. It would be strange if fetching them slowed things down that much.
@evazion Does this conflict at all with r888888888/dtext_rb@12fe26c...?
Regardless, I checked out topic #13272, which has ~6000 wiki links, and it was excessively slow. Admittedly, that is probably the extreme of extremes; nevertheless, it's a good test page for determining how different optimizations may help.
I also noticed with the above that ID'd wiki links are not processed, i.e. "/wiki_pages/12345". Is that something that could be added?
> Can you investigate what exactly slows things down? Repeatedly switching back to Ruby to execute the block, object creation, or calling the Tag functions?
Haven't looked into it very deeply yet, but I think the main culprit is looking up the tag data. Tag types are memcached, but doing 1000 calls to memcache to look up types one by one adds up: a single call takes 0.25–0.3ms on my machine, so 1000 tags is around 300ms.
Batching all the memcache lookups into one big call to memcache would probably help, but that would take two parsing passes: one pass to find all the tags in the document and look them up together, then a second pass to take this data and generate the HTML. This would need some refactoring so the first pass doesn't generate any output; otherwise it will be slowed down by generating HTML that isn't used.
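A rough sketch of the two-pass idea (hypothetical code, not the actual parser): a regex scan stands in for a collection-only parse pass, and a Hash built in one step stands in for a single batched `get_multi` call to memcached.

```ruby
# Pass 1: collect every wiki link target without generating any HTML.
def scan_wiki_links(text)
  text.scan(/\[\[(.+?)\]\]/).flatten.uniq
end

# One batched lookup for all tags at once; in production this would be a
# single memcached get_multi instead of N individual gets.
def batch_categories(tags)
  tags.to_h { |tag| [tag, 0] }  # 0 (general) is a placeholder category
end

# Pass 2: render, reading categories out of the prefetched hash.
def render_wiki_links(text)
  categories = batch_categories(scan_wiki_links(text))
  text.gsub(/\[\[(.+?)\]\]/) do
    tag = Regexp.last_match(1)
    %(<a class="dtext-link dtext-wiki-link tag-type-#{categories[tag]}" href="/wiki_pages/show_or_new?title=#{tag.tr(" ", "_")}">#{tag}</a>)
  end
end
```

The key property is that the per-tag cost of pass 2 is a Hash read rather than a round trip to memcache, so the 1000-tag case pays for one batched network call instead of 1000.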
There are definitely inefficiencies in how objects are allocated and in the switching between C and Ruby, but I think the bottleneck is the large number of calls to memcache.
> @evazion Does this conflict at all with r888888888/dtext_rb@12fe26c...?
No, it's based off of that commit. If you're looking at my master branch, it's out of date; this work lives in this branch: https://github.com/evazion/dtext_rb/commits/feat-colorize-wiki-links.
I'll have it handle ID wiki links too, just haven't gotten that far yet.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
The original issue was fixed. Colorizing DText tags has moved to #4112.
Links generated by DText should get CSS classes. This is useful for custom CSS and userscripts. I suggest the following:
{{hatsune_miku}}
- class="dtext-link dtext-post-search-link"

[[hatsune miku]]
- class="dtext-link dtext-wiki-link"

"example":http://www.example.com
- class="dtext-link dtext-external-link"

post #1234
- class="dtext-link dtext-id-link dtext-post-id-link"

topic #1234
- class="dtext-link dtext-id-link dtext-topic-id-link"

forum #1234, pool #1234, etc.

I also propose that external links be identified with an icon, like on Wikipedia. There was discussion of this in the custom CSS thread, but I think it should be in the base site.
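For illustration, once the class exists the external-link icon could be done purely in CSS, Wikipedia-style. A sketch (the icon path is a placeholder, not a real site asset):

```css
/* Hypothetical: append a small icon after every external link. */
a.dtext-external-link::after {
  content: "";
  display: inline-block;
  width: 0.8em;
  height: 0.8em;
  margin-left: 0.2em;
  background: url("/images/external-link.svg") no-repeat center / contain; /* placeholder path */
}
```

This is also why the classes are useful on their own: users who dislike the icon could hide it with one rule of custom CSS.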
Finally, it would be nice to add classes to [[hatsune miku]] links for post count and tag type, so that tags get colorized in the forum and the wiki. The problem is that getting the tag type and post count requires database lookups, and that's a no-go in the DText parser. I do it in JavaScript in my userscript, but I don't know if that's acceptable for the site itself.