Closed mgeyer closed 7 years ago
@mgeyer I'll research about that for OX, but also take a look how Rexml and Nokogiri will work on this kind of situation. Response with proposals after. Shouldn't take much time.
@soulcutter What do you think about it?
@fanantoxa Thanks for taking a look :) I would hope that both Rexml and Nokogiri at least have support for throwing errors on parsing errors. If they do then we should make sure that either all adapters throw errors or none do.
I think it makes sense for Saxerator to implement the #error
method and raise an error.
@soulcutter, @mgeyer Nokogiri and Ox have a error calbek. I have to test manualy with REXML, since error
calback not documenter. Same for Added recently oga
parser
I'm not -too- concerned that all parsers raise errors (or none of them) - I think the expectation is that different parsers may have slightly different behavior based on what they provide. If we can raise an error, I would like to, though, since I think it's a better practice than being silent.
Agreed, or at least a way to configure the parsers to raise errors.
@mgeyer , @soulcutter Yep. Same for me. As I understand Oga raise and exception on error.
Didn't check REXML yet, but it might be the same.
I think the best idea here would be to catch this exceptions (or add error
callback for first 2 parser) and raise our exception like Saxerator::DataCorruptedException
. In this case any parser on error would behave the same.
How you look on that?
@mgeyer, @soulcutter Finished in master
Thanks for the hard work @fanantoxa :)
Hi There,
I was debugging some fault XML (invalid byte sequence for UTF-8) and noticed that Saxerator just ignored the error and stopped processing. It would process all nodes successfully up to that point, and then stop finding more elements.
At first I thought that was Saxerator having an issue, but upon closer inspection it turns out that is how the Ox SAX parser is configured by default. In short, unless an error method is defined for the Ox SAX parser it will silently close the remaining tags. You can check out (https://github.com/ohler55/ox/issues/166) to get more details.
How do people feel about adding the error method and raising an exception when an error is found by the Ox parser?