Closed Witiko closed 2 months ago
I took a closer look at the depth_first_search
function, and while I can't pinpoint exactly where it goes wrong, I wrote a simpler, recursive alternative that seems to work for me.
local function depth_first_search(node, path, visit, leave)
visit(node, path)
for label, child in pairs(node) do
if type(child) == "table" then
depth_first_search(child, path .. label, visit, leave)
else
visit(child, path)
leave(child, path)
end
end
leave(node, path)
end
It needs to be called with an empty string as its second parameter:
depth_first_search(prefix_tree, "", function ...
Since I can't edit this PR directly, feel free to commit this change, and let's see what the CI tests have to say about it.
Thanks, simpler is better. I will test out your code on Monday at the latest. Perhaps we can get rid of the leave()
call for leaf nodes as well, since we don't need it.
Since I can't edit this PR directly, feel free to commit this change, and let's see what the CI tests have to say about it.
As far as I know, you can create your own fork of this repository and open a second-order PR that would merge into the branch feat/improve-speed
from this PR.
However, you don't need to: With your permission, I will use your code from https://github.com/Witiko/markdown/pull/482#issuecomment-2293428688 and commit it with a by-line that states that it is your contribution. This way, GitHub will register the commit as your contribution as well.
Following a successful CI run, this PR should repeat the experiment from #474 to see if this has sufficiently improved speed the speed of the Markdown package. If so, then we should also make the following changes:
- Test that the
markdown-cli
command finishes under 1 second in the CI.
I verified on my Dell G5 15 notebook that after commit 6eff522 from this PR, the markdown-cli
command finishes under 1 second:
$ docker run --rm -i ghcr.io/witiko/markdown:3.7.0-dev.1-7-g6eff5224-latest-minimal-no_docs bash -c 'time markdown-cli <<< foo'
\markdownRendererDocumentBegin
foo\markdownRendererDocumentEnd
real 0m0.258s
user 0m0.204s
sys 0m0.053s
In commit 3e8fe8ad, I added a test that the markdown-cli
command finishes under 1 second to the CI, so that we can catch similar speed regressions in the future.
If not, this PR should continue by making the following changes:
- Precompile
parsers.punctuation
to a separate Lua file.- Remove dependency on
UnicodeData.txt
.These changes will close https://github.com/Witiko/markdown/issues/458 and continue https://github.com/Witiko/markdown/issues/402.
Since updating the parser has been sufficient for fixing the speed issue, I postponed the removal of the dependency on UnicodeData.txt
to the next release, see also ticket https://github.com/Witiko/markdown/issues/486.
This PR makes the following changes:
parsers.punctuation
using prefix trees.This change will close #474.
Following a successful CI run, this PR should repeat the experiment from https://github.com/Witiko/markdown/issues/474 to see if this has sufficiently improved speed the speed of the Markdown package. If so, then we should also make the following changes:
markdown-cli
command finishes under 1 second in the CI.If not, this PR should continue by making the following changes:
parsers.punctuation
to a separate Lua file.UnicodeData.txt
.These changes will close https://github.com/Witiko/markdown/issues/458 and continue https://github.com/Witiko/markdown/issues/402.