Closed: jslegers closed this 9 years ago
Hi! In my opinion, it is always a bad idea to try to extend the DOM document classes.
We should have learned this from prototype.js. To avoid all the problems with DOM extension, jQuery decided to always wrap in a class.
`$("span")` is just a shortcut for `$("span", window.document)`. To port the same thing, the right approach would be `new DOMQuery($domDocumentInstance)`.
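To make the wrapping idea concrete, here is a minimal sketch that uses only PHP's built-in `DOMDocument` and `DOMXPath`. The class `DOMQuerySketch` and its method `findByTag` are hypothetical names invented for illustration; they are not the API of DOM-Query or any other library.

```php
<?php
// Hypothetical wrapper: class and method names are illustrative only.
final class DOMQuerySketch
{
    private DOMDocument $document;
    /** @var DOMNode[] */
    private array $nodes;

    public function __construct(DOMDocument $document, ?array $nodes = null)
    {
        // The document is passed in, never created or subclassed here.
        $this->document = $document;
        $this->nodes = $nodes ?? [$document];
    }

    // Returns a new wrapper that shares the same underlying document.
    public function findByTag(string $tagName): self
    {
        $xpath = new DOMXPath($this->document);
        $found = [];
        foreach ($this->nodes as $context) {
            foreach ($xpath->query('descendant::' . $tagName, $context) as $node) {
                $found[] = $node;
            }
        }
        return new self($this->document, $found);
    }

    public function count(): int
    {
        return count($this->nodes);
    }

    public function document(): DOMDocument
    {
        return $this->document;
    }
}

$document = new DOMDocument();
$document->loadHTML('<p><span>a</span><span>b</span></p>');

$query = new DOMQuerySketch($document);
$spans = $query->findByTag('span');
```

Because the wrapper only holds a reference, `$spans` and `$query` operate on the same `DOMDocument` instance, however that instance was produced.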
The advantages of this approach are:

- `$domDocumentInstance` can be obtained in a wide range of ways (using html5, PHP DOMDocument, etc.)
- `DOMQuery` class.

@goetas :
Actually, the approach you're suggesting _is_ the approach that I'm taking.
The project makes use of the following classes :

- `\PowerTools\DOM_Document`: This subclass of `\DOMDocument` merely adds the browser DOM methods `querySelector` and `querySelectorAll` to `\DOMDocument` (used for DOM selection) and improves the methods for saving to or loading from strings. It contains no DOM manipulation logic.
- `\PowerTools\DOM_HTML`: This subclass of `\PowerTools\DOM_Document` is merely a convenience abstraction for `\PowerTools\DOM_Document` with no real differences. So it also contains no DOM manipulation logic.
- `\PowerTools\DOM_XML`: This subclass of `\PowerTools\DOM_Document` is merely a convenience abstraction for `\PowerTools\DOM_Document`. Right now, the only difference from `\PowerTools\DOM_HTML` is the value of the flag `$_isHTML`. So it also contains no DOM manipulation logic.
- `\PowerTools\DOM_Query`: This class contains all the DOM manipulation logic. Running the constructor of `\PowerTools\DOM_Query` assigns an instance of `\PowerTools\DOM_Document` to the public property `$this->DOM`. When a method of `\PowerTools\DOM_Query` returns another `\PowerTools\DOM_Query`, it merely copies the reference to `$this->DOM` and thus keeps using the same `\PowerTools\DOM_Document`, which keeps overhead minimal and flexibility maximal.
- `\PowerTools\DOM_Helper`: This class contains just a few static helper methods used by `\PowerTools\DOM_Query`.

1) Putting the CSS query selector inside the document is a bad idea. Even the DOMXPath is a separate class.
2) Subclassing is different from wrapping.
3) `DOM_Query::__construct` should take the DOMDocument as input, instead of instantiating a new one.

> 1) Putting the CSS query selector inside the document is a bad idea. Even the DOMXPath is a separate class.

In JavaScript, `querySelectorAll` and `querySelector` are methods of the `Document` and `Element` objects. As the purpose of this project is to copy the behavior of jQuery, it makes sense to also copy this behavior.
Also, I see little benefit in not including them inside the document object. The actual CSS selection logic is taken care of by Symfony's CssSelector component, with the `querySelectorAll` method acting as an abstraction wrapper around it and the `querySelector` method as an abstraction wrapper around `querySelectorAll`.

```php
public function querySelectorAll($selector, $contextnode = null) {
    // Toggle HTML-specific selector handling to match the document type.
    if ($this->_isHTML) {
        CssSelector::enableHtmlExtension();
    } else {
        CssSelector::disableHtmlExtension();
    }
    // Translate the CSS selector to XPath and run it against this document.
    $xpath = new \DOMXpath($this);
    return $xpath->query(CssSelector::toXPath($selector, 'descendant::'), $contextnode);
}

public function querySelector($selector, $contextnode = null) {
    // Return the first match only, or null when nothing matches.
    $items = $this->querySelectorAll($selector, $contextnode);
    if ($items->length > 0) {
        return $items->item(0);
    }
    return null;
}
```
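
To illustrate what the `$contextnode` argument and the `'descendant::'` prefix do, here is a small sketch that uses only PHP's built-in `DOMXPath`. The expression `descendant::span` is written by hand as a stand-in for what a CSS selector such as `span` is assumed to translate to; the translation step itself is left out.

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML(
    '<div id="a"><span>one</span></div>' .
    '<div id="b"><span>two</span><span>three</span></div>'
);

$xpath = new DOMXPath($doc);

// Hand-written stand-in for the translated selector (an assumption).
$expression = 'descendant::span';

// No context node: the expression is evaluated from the document root.
$all = $xpath->query($expression);

// With a context node: only descendants of that node are matched.
$divB = $xpath->query('//div[@id="b"]')->item(0);
$scoped = $xpath->query($expression, $divB);
```

Here `$all` holds every `span` in the document, while `$scoped` holds only the ones inside the second `div`, because a relative XPath expression is resolved against the context node it is given.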
> 2) Subclassing is different from wrapping.

Valid point. Where I say "wrapper", I actually mean "abstraction". I'll fix that in my previous post.
Note that I don't really distinguish between wrappers and subclasses in cases where these are used for abstracting and simplifying an interface. They're just different ways to achieve the same thing.

> 3) `DOM_Query::__construct` should take the DOMDocument as input, instead of instantiating a new one.

Actually, `\PHPPowerTools\DOM_Query::__construct` accepts either a string or an instance of `\PHPPowerTools\DOM_Document` as valid input.

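```php
public function __construct($source, $isHtml = true) {
    if (is_string($source)) {
        // A string is parsed into a new document of the matching type.
        if ($isHtml) {
            $this->DOM = new DOM_HTML($source);
            $this->isHtml = true;
        } else {
            $this->DOM = new DOM_XML($source);
            $this->isHtml = false;
        }
    } else {
        // Anything else is used as the document itself.
        $this->DOM = $source;
        $this->isHtml = $isHtml;
    }
    // The initial node set is the document itself.
    $this->nodes = array($this->DOM);
}
```
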
I see no valid reason for only allowing an instance of `\PHPPowerTools\DOM_Document`. On the other hand, there are several reasons for allowing both :

```javascript
// Define your DOMCrawler
$ = jQuery;
// Passing a string (CSS selector)
$s = $( 'div.foo' );
// Passing an element object (DOM Element)
$s = $( document.body );
// Passing a jQuery object
$s = $( $('p + p') );
```

```php
namespace PowerTools;
// Get file content
$htmlcode = file_get_contents( 'https://github.com' );
// Define your DOMCrawler based on file string
$H = new DOM_Query( $htmlcode );
// Define your DOMCrawler based on an existing DOM_Query instance
$H = new DOM_Query( $H->select('body') );
// Passing a string (CSS selector)
$s = $H->select( 'div.foo' );
// Passing an element object (DOM Element)
$s = $H->select( $documentBody );
// Passing a DOM Query object
$s = $H->select( $H->select('p + p') );
```
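
As a side note on accepting both kinds of input, the following sketch shows the same idea with only the built-in `DOMDocument`, plus an explicit check for unsupported input types. The function name `toDocumentSketch` is hypothetical and is not part of DOM-Query.

```php
<?php
// Hypothetical helper, not part of any library discussed here.
function toDocumentSketch($source, bool $isHtml = true): DOMDocument
{
    if ($source instanceof DOMDocument) {
        // Reuse the caller's document (or subclass instance) as-is.
        return $source;
    }
    if (!is_string($source)) {
        throw new InvalidArgumentException('Expected a markup string or a DOMDocument.');
    }
    // A string is parsed into a fresh document.
    $document = new DOMDocument();
    if ($isHtml) {
        $document->loadHTML($source);
    } else {
        $document->loadXML($source);
    }
    return $document;
}

$existing = new DOMDocument();
$same = toDocumentSketch($existing);
$fromXml = toDocumentSketch('<root><item/></root>', false);
```

Passing an existing document returns that very instance, so the caller stays in control of how the document was created.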
I fixed the problem.
For this, I needed to add two new options.
These options are implemented only in `\Masterminds\HTML5\Parser\DOMTreeBuilder` and have the following purpose :

- `implicitHtmlNamespace` = allows the use of `createElement` instead of `createElementNS` for HTML elements. This is required for compatibility with `\PHPPowertools\DOM-Query` and for compatibility with `\Symfony\Component\CssSelector\CssSelector`.
- `target` = allows an existing DOMDocument (or subclass thereof) to be passed to the DOMTreeBuilder instead of creating a new one. This option is required for compatibility with `\PHPPowertools\DOM-Query`.

I created a new pull request with the two changes -> https://github.com/Masterminds/html5-php/pull/69
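
For readers wondering why the choice between `createElement` and `createElementNS` matters for selection, here is a small sketch using only built-in PHP DOM classes. It relies on the XPath 1.0 rule that an unprefixed name test matches only elements in no namespace; the element names and the prefix `h` are arbitrary choices for the example.

```php
<?php
$doc = new DOMDocument();
$root = $doc->appendChild($doc->createElement('root'));

// Element without a namespace, as createElement() produces.
$root->appendChild($doc->createElement('span', 'plain'));

// Element in the XHTML namespace, as createElementNS() produces.
$xhtml = 'http://www.w3.org/1999/xhtml';
$root->appendChild($doc->createElementNS($xhtml, 'span', 'namespaced'));

$xpath = new DOMXPath($doc);

// An unprefixed name test only matches the element in no namespace.
$plain = $xpath->query('descendant::span');

// Matching the namespaced element requires a registered prefix.
$xpath->registerNamespace('h', $xhtml);
$namespaced = $xpath->query('descendant::h:span');
```

Each query finds exactly one of the two `span` elements, which suggests why an unprefixed XPath expression generated from a CSS selector would miss elements that were created in the XHTML namespace.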
_PHPPowertools/DOM-Query_ is the first component of the _PHPPowertools_ framework that has been released to the public. Its purpose is similar to that of _technosophos/querypath_, but its implementation is far more true to both jQuery's syntax and its semantics. For example, _PHPPowertools/DOM-Query_ lets you do stuff like this :
What's lacking so far is proper support for HTML5. I've been considering using _Masterminds/html5-php_ to do the DOM parsing.
The most elegant way to implement the feature would be by adding a `target` option to the supported options for `\Masterminds\HTML5\Parser\DOMTreeBuilder::__construct`, with support for the following datatypes :

- `\DOMDocument` or subclasses of `\DOMDocument`
- `\DOMImplementation` or subclasses of `\DOMImplementation`

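
As a rough illustration of such an option, the sketch below shows how a constructor could choose its document based on `$options['target']`. `TreeBuilderSketch` and `MyDocumentSketch` are hypothetical stand-ins written for this example; this is not the html5-php implementation, whose constructor does more than pick a document.

```php
<?php
// Hypothetical stand-in for a tree builder; not the html5-php class.
final class TreeBuilderSketch
{
    private DOMDocument $doc;

    public function __construct(array $options = [])
    {
        $target = $options['target'] ?? null;

        if ($target instanceof DOMDocument) {
            // Reuse the caller's document, including any subclass instance.
            $this->doc = $target;
        } elseif ($target instanceof DOMImplementation) {
            // Let the supplied implementation create the document.
            $this->doc = $target->createDocument(null, 'html');
        } elseif ($target === null) {
            // No target given: fall back to a fresh document.
            $this->doc = new DOMDocument('1.0', 'UTF-8');
        } else {
            throw new InvalidArgumentException('Unsupported "target" option.');
        }
    }

    public function document(): DOMDocument
    {
        return $this->doc;
    }
}

// A DOMDocument subclass, standing in for something like DOM_Document.
class MyDocumentSketch extends DOMDocument
{
}

$mine = new MyDocumentSketch();
$builder = new TreeBuilderSketch(['target' => $mine]);
```

With this shape, the builder writes into the caller's own subclass instance, so any extra methods defined on that subclass remain available on the resulting document.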
I would like to use this feature as follows :
I've tried adding a simple `if(){}else{}` statement to `\Masterminds\HTML5\Parser\DOMTreeBuilder::__construct` to replace `$this->doc` with `$options['target']` if a value for `$options['target']` has been set, but that doesn't seem to do it.
As an alternative, I've also considered reïmplementing `\PowerTools\DOM_Document` as a subclass of `\DOMImplementation`, but this is a far less elegant approach that introduces too many new issues to go any further in that area.
Any feedback would be appreciated!
See also https://github.com/PHPPowertools/DOM-Query/issues/1