Closed zyrolasting closed 4 years ago
This performance hit seems reasonable. You might also want to run and time the markdown test suite, which includes some performance tests.
Thanks for doing this work!
@stchang Happy to do it. Thanks for reviewing.
Alright, I added a test where the main thread interleaves stateful parsing with another thread. The main thread attempts to destroy the state via an implied user-state-reset!
. The test fails if the state is indeed reset mid-parse.
As for Markdown performance tests, they pass and this is what I see. (+cc: @greghendershott)
Strict mode:
Using /home/sage/Code/markdown/markdown/test/test.md
appended 5 times: 17398 chars and 974 lines. Doing 5 timings...
Timings: 87, 87, 90, 90, 91 (sorted)
Average: 89.0
Non-strict mode (with extensions):
Using /home/sage/Code/markdown/markdown/test/test.md
appended 5 times: 17398 chars and 974 lines. Doing 5 timings...
Timings: 101, 101, 104, 105, 105 (sorted)
Average: 103.2
[...]
raco test -x -s slow-test .
raco test: (submod "./markdown/random-test.rkt" slow-test)
Trying 500 docs with 300 "tokens" each, want < 5 sec: 1 51 101 151 201 251 301 351 401 451
As for Markdown performance tests, they pass and this is what I see
Do you have numbers from before this patch?
Ah, sorry. I ran both again since I can't remember if I had bytecode available for every module. For good measure, here's my console session for each test pass.
cd parsack
git co $BRANCH
raco setup parsack markdown
cd -
cd markdown
make test-all
I omitted the output from markdown
's test-slow
target since it's always the same.
master
Strict mode:
Using /home/sage/Code/markdown/markdown/test/test.md
appended 5 times: 17398 chars and 974 lines. Doing 5 timings...
Timings: 80, 80, 81, 82, 86 (sorted)
Average: 81.8
Non-strict mode (with extensions):
Using /home/sage/Code/markdown/markdown/test/test.md
appended 5 times: 17398 chars and 974 lines. Doing 5 timings...
Timings: 92, 92, 93, 93, 93 (sorted)
Average: 92.6
thread-cell
Strict mode:
Using /home/sage/Code/markdown/markdown/test/test.md
appended 5 times: 17398 chars and 974 lines. Doing 5 timings...
Timings: 87, 89, 91, 95, 99 (sorted)
Average: 92.2
Non-strict mode (with extensions):
Using /home/sage/Code/markdown/markdown/test/test.md
appended 5 times: 17398 chars and 974 lines. Doing 5 timings...
Timings: 105, 106, 107, 107, 107 (sorted)
Average: 106.4
Thanks! I think that is very reasonable.
I'm ready to merge but I'd like to wait to see if @greghendershott and @mbutterick want to weigh in.
Summary: parsack
(and thus markdown
) is broken in some multi-threaded scenarios. This patch fixes it, at the cost of what looks like a ~10% slowdown in some benchmarks.
You mean, do I care if Markdown users must absorb the costs of their life choices? Hell no.
@stchang Note that the times I posted are for 7.6 CS, not 3m. If you want, run the session again if you have a 3m installation handy--unlike me.
I agree with @mbutterick this seems like reasonable ~punishment~ performance for markdown users.
Seriously, as someone who often uses markdown for my blog (as well as org and scribble for other purposes) I think this is acceptable.
@zyrolasting Thanks again for investigating and fixing this!
So long as @mbutterick is happy, I'm happy.
Re: https://github.com/stchang/parsack/issues/50#issuecomment-590435790
Adding thread cells do have an impact on performance. These timings below are derived by running
parse-markdown
as bytecode on a single 564K Markdown file on my machine.master
:thread-cells
:I'm currently unsure about how to write a reliable test that verifies that no two threads interfere with each other. I think what I might do is create a user state with a counter, and confirm that 2 threads started at different times each parse 1000 characters and end up on position 1001 or something.
Obviously do not approve until I get a test in, but I want to know if you'd want a global switch to let the user specify shared state. That might save the performance, but it makes the interface a little more awkward ("Oh, are you multithreaded? Configure the library first")