Closed Yasumoto closed 6 years ago
@Yasumoto Hi! Thanks for contributing 😃
Care to discuss these with me?
As I see it, `innerXMLContentsAccumulator` should not exist. It doesn't need to. Do you think it's possible for us to get rid of it? Here's my reasoning:

- If `self.currentXMLDOMPath` is left untouched with the correct path, calling `mapCharacters` should be able to keep appending everything you throw at it, in the right place.
- There is an assumption that the XHTML will start with a `div` element. The RFC doesn't specify this. XHTML can technically have a root element `html`.
- When `div` is parsing, we already know it should be handled as XHTML. The enclosing element with `type="xhtml"` already tells us this.
- Maybe we should update the link reference to point to the RFC specification and not xml.com.
- RSS handles HTML/XHTML enclosed within CDATA tags or with encoded entities. Just something to keep in mind, because if it's parsing an element with type `xhtml` in an RSS feed, it will incorrectly assume that it doesn't already handle this situation. I wonder if we should make sure this is only valid for Atom feeds.
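To make the points above about keeping the path untouched and limiting this to Atom a bit more concrete, here is a rough, standalone sketch of the idea. It is not the framework's code: `PathAccumulator`, `allowsInlineXHTML`, and the three event methods are hypothetical names made up for illustration, and the events are fed in by hand rather than coming from a real XML parser. It also does not re-escape characters such as `<` or `&` in text, which a real implementation would probably need to do.

```swift
// Hypothetical sketch; none of these names come from the framework itself.
struct PathAccumulator {
    /// Element names from the root down to the element currently being parsed.
    private(set) var path: [String] = []
    /// Text collected so far, keyed by the "/"-joined path.
    private(set) var text: [String: String] = [:]
    /// Nesting depth inside a type="xhtml" element; 0 means "not inside one".
    private var xhtmlDepth = 0
    /// Assumption: the caller sets this to true only for Atom feeds.
    let allowsInlineXHTML: Bool

    init(allowsInlineXHTML: Bool) {
        self.allowsInlineXHTML = allowsInlineXHTML
    }

    var currentPath: String { "/" + path.joined(separator: "/") }

    mutating func didStartElement(_ name: String, attributes: [String: String] = [:]) {
        if xhtmlDepth > 0 {
            // Inside XHTML content: leave the path alone and write the tag back out as text.
            xhtmlDepth += 1
            let attrs = attributes
                .sorted { $0.key < $1.key }
                .map { " \($0.key)=\"\($0.value)\"" }
                .joined()
            append("<\(name)\(attrs)>")
            return
        }
        path.append(name)
        // The enclosing element's type attribute is what switches XHTML handling on,
        // so no assumption about a leading div is needed.
        if allowsInlineXHTML, attributes["type"] == "xhtml" {
            xhtmlDepth = 1
        }
    }

    mutating func foundCharacters(_ string: String) {
        append(string)
    }

    mutating func didEndElement(_ name: String) {
        if xhtmlDepth > 1 {
            // Closing a tag that belongs to the inline XHTML.
            xhtmlDepth -= 1
            append("</\(name)>")
            return
        }
        // Closing a regular element (or the type="xhtml" element itself).
        xhtmlDepth = 0
        path.removeLast()
    }

    private mutating func append(_ string: String) {
        let key = currentPath
        text[key, default: ""] += string
    }
}

// Hand-fed events for an Atom-style entry with inline XHTML content.
var atom = PathAccumulator(allowsInlineXHTML: true)
atom.didStartElement("feed")
atom.didStartElement("entry")
atom.didStartElement("content", attributes: ["type": "xhtml"])
atom.didStartElement("div")
atom.foundCharacters("Hello ")
atom.didStartElement("b")
atom.foundCharacters("world")
atom.didEndElement("b")
atom.didEndElement("div")
atom.didEndElement("content")

print(atom.text["/feed/entry/content"] ?? "")
// prints "<div>Hello <b>world</b></div>"
```

With `allowsInlineXHTML` set to `false` (what I would expect for RSS), the same events just push `div` and `b` onto the path as ordinary elements, so the existing CDATA and encoded-entity handling is not affected.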
Ok, so this is just off the top of my head. Can you give your input on these?
Thanks again for contributing!
Cheers
Closing this for inactivity, for now.
Thanks @nmdias, I’ll pick this back up in June after some other projects get wrapped up 👍
What an amazing framework!
The content I'm parsing looks like it follows this format of including HTML within the content field, and this is a stab at parsing that.
Ran the tests, but totally open to feedback on improvements.
Thanks! 🙏