Original report by Adam Victor Brandizzi (Bitbucket: brandizzi, GitHub: brandizzi).

The animator object is still in a fluid state, but we already have some ideas about how it should be organized.

The returned elements

The elements it returns should follow the ElementTree API. They can be a superset of ET.Element but should always try to follow some existing implementation (e.g. lxml).

Rationale: ElementTree is a simple and well-known API.

The parser objects

The animator should receive a parser as one of its constructor arguments.

A parser should have two methods, parse_string() and parse_file(), each of which returns an object implementing the ElementTree API.

This ET object will probably not be one already implemented by an existing parser. Instead, an existing parser (such as xml.etree.ElementTree, lxml.etree or BeautifulSoup) will be used to parse the XML data and then generate a proxy value (the ShadowElement we already created).
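A minimal sketch of such a parser, backed by xml.etree.ElementTree, might look like the code below. The ShadowElement shown here is only a hypothetical stand-in for the project's real proxy class, and the ETParser name is likewise an assumption made for illustration.

```python
import io
import xml.etree.ElementTree as ET


class ShadowElement:
    """Hypothetical stand-in for the project's ShadowElement proxy."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Delegate tag, text, attrib, find(), etc. to the wrapped element.
        # Note: results of delegated calls such as find() are NOT re-wrapped
        # in this sketch.
        return getattr(self._element, name)

    def __iter__(self):
        # Children are wrapped so iteration also yields proxies.
        return (ShadowElement(child) for child in self._element)

    def __len__(self):
        return len(self._element)


class ETParser:
    """Hypothetical parser built on xml.etree.ElementTree."""

    def parse_string(self, text):
        return ShadowElement(ET.fromstring(text))

    def parse_file(self, source):
        # `source` may be a file name or a file-like object.
        return ShadowElement(ET.parse(source).getroot())


parser = ETParser()
root = parser.parse_string("<html><body><p>Hi</p></body></html>")
print(root.tag, [child.tag for child in root])
```

A parser based on lxml.etree or BeautifulSoup would expose the same two methods and differ only in how it builds the wrapped element.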

The parser should also provide methods for searching. So far, it can be something like parser.find_by_xpath(), parser.find_by_class() etc. If some parser does not support a search method (e.g. an xml.etree.ElementTree-based parser would not support a find_by_css_selector() method), it does not need to be implemented. The animator should have find() and take() methods which would map to these parser methods; i.e. animator.take(xpath='...') would call parser.find_by_xpath('...'). The returned element (from the parser) should map to an ET element of the animator.
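The mapping from keyword arguments to parser methods could be sketched as follows. Several things here are assumptions rather than decisions from this draft: the parser remembers the document it last parsed, the animator constructor also takes the text to parse, an unsupported search raises NotImplementedError, and find() and take() share the same dispatch because how they differ is not yet defined.

```python
import xml.etree.ElementTree as ET


class ETParser:
    """Hypothetical xml.etree-based parser; supports only some searches."""

    def parse_string(self, text):
        # Assumption: the parser remembers the document it last parsed.
        self.root = ET.fromstring(text)
        return self.root

    def find_by_xpath(self, path):
        # ElementTree supports only a limited XPath subset.
        return self.root.findall(path)

    def find_by_class(self, name):
        return [el for el in self.root.iter()
                if name in el.get("class", "").split()]

    # No find_by_css_selector(): this backend cannot support it.


class Animator:
    def __init__(self, parser, text):
        self.parser = parser
        self.document = parser.parse_string(text)

    def _search(self, **criteria):
        if len(criteria) != 1:
            raise TypeError("expected exactly one search criterion")
        (kind, value), = criteria.items()
        # take(xpath='...') looks up parser.find_by_xpath, and so on.
        method = getattr(self.parser, "find_by_" + kind, None)
        if method is None:
            raise NotImplementedError(
                "parser does not support searching by " + kind)
        return method(value)

    def find(self, **criteria):
        return self._search(**criteria)

    def take(self, **criteria):
        return self._search(**criteria)


animator = Animator(ETParser(), '<div><p class="intro">a</p><p>b</p></div>')
print([el.text for el in animator.take(xpath="./p")])
```

Since class is a reserved word in Python, a class search in this sketch has to be written as animator.find(**{"class": "intro"}); the final API may want a different keyword name.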

Rationale: One may want to use different parsers: for example, lxml if efficiency is crucial, html5lib for more complete HTML5 support, or BeautifulSoup for a laxer parser. Also, we need to provide a widely available implementation. Such an implementation could rely on xml.etree.ElementTree or HTMLParser. These parsers can be quite limited, however, so one should be able to easily replace them.

The renderer objects

The animator should receive a renderer as one of its constructor arguments.

A renderer should have a method, tostring(), which expects an ElementTree-like document or element. Then it will return a string representing its argument.

The animator should also have a tostring() method, with one optional argument. If the argument is not given, then the constructor's renderer is used to return a string representing the animator's document; if the argument is given, then it should be a renderer, and it should be used instead of the constructor's. The __str__() method of the animator should call tostring() with no argument.
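The renderer contract described above could be sketched like this. The XMLRenderer and HTMLRenderer classes are hypothetical examples built on xml.etree.ElementTree.tostring(), and the animator here receives an already parsed document instead of a parser only to keep the example short.

```python
import xml.etree.ElementTree as ET


class XMLRenderer:
    """Hypothetical renderer producing XML-style output."""

    def tostring(self, element):
        return ET.tostring(element, encoding="unicode", method="xml")


class HTMLRenderer:
    """Hypothetical renderer producing HTML-style output."""

    def tostring(self, element):
        return ET.tostring(element, encoding="unicode", method="html")


class Animator:
    def __init__(self, document, renderer):
        # Assumption: the document is passed in directly to keep this short.
        self.document = document
        self.renderer = renderer

    def tostring(self, renderer=None):
        # Fall back to the constructor's renderer when none is given.
        if renderer is None:
            renderer = self.renderer
        return renderer.tostring(self.document)

    def __str__(self):
        return self.tostring()


document = ET.fromstring("<p>Hi<br/></p>")
animator = Animator(document, XMLRenderer())
print(str(animator))                      # rendered by the default renderer
print(animator.tostring(HTMLRenderer()))  # rendered by the one passed in
```

Keeping the renderer behind a single tostring() method means a new output flavour only needs a new renderer class, with no change to the animator.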

Rationale: HTML has many incarnations: HTML 4.01, XHTML, HTML5 etc. Others will undoubtedly arise. We should be able to use whichever we prefer.

Conclusion

Those are some ideas about how to implement Golem. They are quite complicated, for sure, but will make our work viable. #4 and #6 are even blocked by it. Much more will need to be defined, but those are nice first steps.

brandizzi / golem

Animator's ins and outs: a draft of the animator's API and infrastructure #7
