Open nickmcintyre opened 1 month ago
Hi,
I found the bug,
If you run this correct code in here it works, I just did changes without setting up the code in my machine.
let myXML;
// Load the XML and create a p5.XML object. function preload() { myXML = loadXML('assets/animals.xml'); }
function listChildren(xml) { const arr = []; for (let i = 0; i < xml.DOM.childNodes.length; i++) { // Only include element nodes, filter out text nodes if (xml.DOM.childNodes[i].nodeType === 1) { // 1 is the nodeType for element nodes arr.push(xml.DOM.childNodes[i].nodeName); } } return arr; }
function setup() { noCanvas();
// Get the names of the element's children as an array. let children = listChildren(myXML);
// Print the array to the console. // Desired: ["mammal", "mammal", "mammal", "reptile"] // Actual: ["#text", "mammal", "#text", "mammal", "#text", "mammal", "#text", "reptile", "#text"] print(children); }
If you want to do full code changes in repository you need to make changes in these places.
correct code :
listChildren(xml) { const arr = []; for (let i = 0; i < xml.DOM.childNodes.length; i++) { // Only include element nodes, filter out text nodes if (xml.DOM.childNodes[i].nodeType === 1) { // 1 is the nodeType for element nodes arr.push(xml.DOM.childNodes[i].nodeName); } } return arr; }
correct code :
function setup() { noCanvas();
// Get the names of the element's children as an array. let children = listChildren(myXML);
// Print the array to the console. // Desired: ["mammal", "mammal", "mammal", "reptile"] print(children); }
👋
I don't see that this bug has been assigned to anyone or that a PR is up so I'd like to volunteer...unless @kris08052000 you were planning to do it?
I suspect the suggested solution of checking for nodeType
will work so I would follow through with that idea and also add a test to the existing suite.
I did read through the contributor docs, but I am new to this codebase and contributing community so apologies if I'm not following protocol in some way. ✌️
Simply filtering out #text
nodes will not work as #text
nodes are legitimate child nodes. For example
<?xml version="1.0"?>
<animals>
There are 4 animals here.
<mammal id="0" species="Capra hircus">Goat</mammal>
<mammal id="1" species="Panthera pardus">Leopard</mammal>
<mammal id="2" species="Equus zebra">Zebra</mammal>
<reptile id="3" species="Caretta caretta">Turtle</reptile>
</animals>
One would expect the #text
node containing the string "There are 4 animals here." to be one of the child nodes of <animals>
. The "extra" #text
nodes that shows up in this case is because of how whitespace is treated in XML (and HTML as well), in that any whitespace either space characters or new line characters still count as text in the tags, more here.
A possible way is to only ignore #text
nodes with only whitespace content but this implementation should first evaluate possible performance impact.
@limzykenneth thanks for the link and explanation about the string being considered a child.
So with listChildren()
, my question is, what is the intended behavior as to what counts as a child - is it all nodes (which include elements, strings and comments) or strictly elements?
Based on the language used in the Reference docs, it seems like only elements should be included as children.
Here'e the Reference description for listChildren()
method:
Returns an array with the names of the element's child elements as Strings.
How I understand the description, I do not expect that calling the method on this
<?xml version="1.0"?>
<animals>
There are 4 animals here.
<mammal id="0" species="Capra hircus">Goat</mammal>
<mammal id="1" species="Panthera pardus">Leopard</mammal>
<mammal id="2" species="Equus zebra">Zebra</mammal>
<reptile id="3" species="Caretta caretta">Turtle</reptile>
</animals>
will include "There are 4 animals here" in the returned array.
What's your take on the description and the intended behavior of the method?
@raquelcps For me, elements
is basically everything. What is being referred to as elements
here is more usually referred to as tags
as in HTML/XML tags, although the definition can drift.
Regardless, in terms of utility, I do think having #text
included is usually more useful than not.
@limzykenneth so the expected behavior is to include all nodes as children. Cool, thanks for clarifying.
Perhaps, separately from this bug fix, that could be made clearer in the Reference docs by adding an example that shows #text
as part of the returned array?
And for this bug, adding test scenarios for both situations seems like a good idea.
I'll see if I can get started on this over the weekend.
I think the blank #text
node probably still make sense to exclude, probably something like a regex that see if a #text
node's content is all whitespace or not, although I'm not sure if it will have significant performance impact with large enough XML file.
Most appropriate sub-area of p5.js?
p5.js version
1.9.3
Web browser and version
Chrome 124.0.6367.208
Operating system
macOS 14.4.1
Steps to reproduce this
Steps:
myXML.listChildren()
.Snippet:
Here's the sketch for reference.