Closed dwarring closed 5 years ago
profiling doesn't shed much light:
<anon> gen/moar/stage2/NQPHLL.nqp 1904 1 241090363 27
<unit-oute t/40reader_mem_error.t 1 1 241090335 874
ACCEPTS SETTING::src/core.c/Bool.p 27 1 30362 437
postcircum SETTING::src/core.c/array_ 102 1 29867 783
name SETTING::src/core.c/Parame 239 1 21720 101
sub_signat SETTING::src/core.c/Parame 616 1 21575 49
ACCEPTS SETTING::src/core.c/Mu.pm6 19 1 21526 102
Very large gap between '
Added some trace statements. Seems to be the first call of xmlTextReaderRead() that takes all the time.
Most likely libxml2 related. Update: further tracked down to xmlParseTryOrFinish()
David,
After zef install .
a couple of your iterations, I observed the parse performing like a lightning bolt, but every text node request seemed to sleep for .1 or .2 seconds. I didn’t mention it then because you’re covering so much ground at the time and I knew that it could wait.
Yes, I confirm that .getChildrenByTagName($TAG) was dragging a bit. My app reads 97,000+ of them in one go, so it was noticable. Very glad that it is in your sights now.
Thanks,
Mark
From: David Warring notifications@github.com Sent: Monday, September 2, 2019 15:58 To: p6-xml/LibXML-p6 LibXML-p6@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: Re: [p6-xml/LibXML-p6] t/40reader_mem_error.t sometimes slow (#14)
Added some trace statements. Seems to be the first call of xmlTextReaderRead() that takes all the time.
Most likely libxml2 related.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/p6-xml/LibXML-p6/issues/14?email_source=notifications&email_token=AHRJF2LQY62P526ATYSRACTQHVV5HA5CNFSM4ISZEOAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5WPIIA#issuecomment-527234080, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AHRJF2LT6VHSXW6V3A7GAYLQHVV5HANCNFSM4ISZEOAA.
Correction:
@xml-nodes[0].textContent was where I was seeing the delay
Mark
From: Mark Devine Sent: Monday, September 2, 2019 17:28 To: p6-xml/LibXML-p6 reply@reply.github.com Subject: RE: [p6-xml/LibXML-p6] t/40reader_mem_error.t sometimes slow (#14)
David,
After zef install .
a couple of your iterations, I observed the parse performing like a lightning bolt, but every text node request seemed to sleep for .1 or .2 seconds. I didn’t mention it then because you’re covering so much ground at the time and I knew that it could wait.
Yes, I confirm that .getChildrenByTagName($TAG) was dragging a bit. My app reads 97,000+ of them in one go, so it was noticable. Very glad that it is in your sights now.
Thanks,
Mark
From: David Warring notifications@github.com<mailto:notifications@github.com> Sent: Monday, September 2, 2019 15:58 To: p6-xml/LibXML-p6 LibXML-p6@noreply.github.com<mailto:LibXML-p6@noreply.github.com> Cc: Subscribed subscribed@noreply.github.com<mailto:subscribed@noreply.github.com> Subject: Re: [p6-xml/LibXML-p6] t/40reader_mem_error.t sometimes slow (#14)
Added some trace statements. Seems to be the first call of xmlTextReaderRead() that takes all the time.
Most likely libxml2 related.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/p6-xml/LibXML-p6/issues/14?email_source=notifications&email_token=AHRJF2LQY62P526ATYSRACTQHVV5HA5CNFSM4ISZEOAKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD5WPIIA#issuecomment-527234080, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AHRJF2LT6VHSXW6V3A7GAYLQHVV5HANCNFSM4ISZEOAA.
@markldevine will have a look at .getChildrenByTagName / .textContent
^^ @markldevine set if that optimisation helps. I'm getting some benefit on a smaller doc.
$ perl6 -I . -M LibXML -e 'my $xml-manager = LibXML.new; for (1..3) {my $start = now; my $doc = $xml-manager.parse(:io("etc/libxml2-api.xml")); $doc[0].textContent; note "fetch: " ~ now - $start;}'
fetch: 0.12039373
fetch: 0.0784591
fetch: 0.0790274
First fetch is down from about 0.18
Problem was just slightly deeper: The 'I/O errors' were the symptom of a real problem. :!validation, :!load-ext-dtd reader flags weren't being correctly handled. So the test was OK as is. Fix was needed in LibXML::Reader options processing.
Completing and passing, but can take an unusually long time. CPU mostly idle.