ocramz / xeno

Fast Haskell XML parser
Other
120 stars 33 forks source link

Allow spaces around equals in attributes #19

Closed unhammer closed 6 years ago

unhammer commented 6 years ago

Things like

    <o m   = "100"
       gee =   "0">

are well-formed and appear in the wild.

unhammer commented 6 years ago

(force pushed; didn't notice all those updates today :))

unhammer commented 6 years ago

By the way, it seems to accept zero-length attribute keys (before this change too):

λ> fold const (\m _ _ -> m + 1) const const const const 0 "<o ='2' ='1'></o>"
Right 2

According to xmllint, this isn't well-formed. I don't know if it's within xeno's goals to strictly follow the standards here?

chrisdone commented 6 years ago

How does this affect the performance? Xeno is mostly a performance effort.

It's up to @ocramz whether it should be lenient or strict.

unhammer commented 6 years ago

How does this affect the performance? Xeno is mostly a performance effort.

Bench results are a bit confusing when I test, even after exiting anything I could find using cpu in htop: https://gist.github.com/unhammer/f3d03362f93d89bb363723995e722dee

(I'm testing on lts-6.35 since that's what we're on at work, don't know if that matters)

ocramz commented 6 years ago

I'd like to properly follow the performance policy @chrisdone .

The easy way is to add a stack bench to the travis file, but then one has to monitor all changes "by hand". Is there a way to keep golden timing tests?

chrisdone commented 6 years ago

The process I was using was whenever a PR comes in, the commit should have a before / after of the speed tests.

I think anything less than that would let speed regressions slip in without you noticing. At least if the commit contains the numbers, it's easy to track the history of performance.

unhammer commented 6 years ago

the commit should have a before / after of the speed tests.

Reworded the commit message with benchmark results (from my other computer, got a bit less variance there it seems). I don't know if it's worth the minor slowdown for others – if we decide to switch to xeno, we could always keep a fork, but maybe there are also others who have to parse xmls with such unnecessary whitespace.

unhammer commented 6 years ago

Uh does anyone understand the CI failure? Never seen that before :-)

ocramz commented 6 years ago

Haven't had time to investigate the weird behaviour on Travis, perhaps I should just remove stack bench from the travisfile.

On Tue, May 22, 2018 at 9:24 AM, Kevin Brubeck Unhammer < notifications@github.com> wrote:

Uh does anyone understand the CI failure? Never seen that before :-)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ocramz/xeno/pull/19#issuecomment-390889459, or mute the thread https://github.com/notifications/unsubscribe-auth/AFoRqEnZMeFr4noCm89hU7FB2sM_mOoqks5t070ngaJpZM4T6bMU .

ocramz commented 6 years ago

For the time being, I'll merge this into a feature branch so that we can test with various configurations before merging into master.