Open SimonSchmid opened 4 years ago
Is this something that will be addressed anytime soon?
Hi, we are a student group and we would like to fix this bug. We can't guarantee that we will be able to fix it, but we would like to try.
Hi @SimonSchmid. I am an undergraduate student. One of my courses this semester related to Software Engineering requires us to fix issues on Github.
I understand the first case, but I am confused by the second case, "one within an attribute". May I ask what the expected output is for the second case? Could you explain a little about "The second case is handled by adding the xlink namespace to the html tag."? Thank you very much.
The second case, as I understand it, is xlink:href="UnboundPrefix". So you want to access the value UnboundPrefix with the name xlink:href, right?
I am currently working on converting ":" to its Unicode escape so that jsoup can return the name containing it in the first case. But I may need more information about the second case.
I now understand what you want for the second case from the link you provided, https://html.spec.whatwg.org/#coercing-an-html-dom-into-an-infoset. You may want to look up the attribute by the key "xlinkU00003Ahref" rather than "xlink:href". Please take a look at PR #1682.
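For readers following along, the name mapping discussed here can be sketched in plain Java. This is only an illustration, assuming that ":" is the one character that needs replacing; the linked spec section describes the general rule as "U" followed by the six-digit uppercase hexadecimal code point. The class InfosetNames and the method coerce are hypothetical names, not part of jsoup and not the code in PR #1682.

```java
public class InfosetNames {
    // Hypothetical helper (not a jsoup API): replaces each ':' in a name with
    // 'U' plus the six-digit uppercase hex code point of that character.
    // Assumption: only ':' is treated as unsupported in this sketch.
    public static String coerce(String name) {
        StringBuilder out = new StringBuilder();
        name.codePoints().forEach(cp -> {
            if (cp == ':') {
                out.append(String.format("U%06X", cp)); // ':' is U+003A
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(coerce("test:h1"));    // prints testU00003Ah1
        System.out.println(coerce("xlink:href")); // prints xlinkU00003Ahref
    }
}
```

Names without a colon pass through unchanged, so the mapping only affects prefixed tag and attribute names.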
Hi @SimonSchmid, sorry for the late reply on this. Can you give more detail or an example of what you want to do with the xpath selector and how you're interacting with it? I want to make sure I understand the use case correctly.
In #1801 we disabled the namespace for elements when running through the xpath selector, for general convenience.
So e.g. el.selectXpath("//h1") finds the first example. See example.
Hello, I want to report an issue I am having with jsoup. I have not found a similar issue, so I am creating a new one.
I created a toy example that illustrates the issue:
This webpage contains two unbound prefixes, one within a tag and one within an attribute. Jsoup does not handle these according to https://html.spec.whatwg.org/#creating-and-inserting-nodes and https://html.spec.whatwg.org/#coercing-an-html-dom-into-an-infoset. According to those sections, the first case (tag) should be handled as follows:
<test:h1> becomes <testU00003Ah1>. The second case is handled by adding the xlink namespace to the html tag.
Without the unbound prefixes being fixed, I have issues using XPath. It would be nice if jsoup handled such cases.
Regards, Simon
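The two cases in the report can be reproduced without jsoup, which may help show why namespace-aware XML tooling rejects unbound prefixes. The sketch below uses the JDK's own DocumentBuilder rather than jsoup's W3C DOM conversion, so it is an analogy under that assumption and not a trace of jsoup's behaviour. The class UnboundPrefixDemo, the method parses, and the namespace URI in the comments are illustrative choices.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class UnboundPrefixDemo {
    // Hypothetical helper: returns true if the markup parses as
    // namespace-aware XML, false if the parser rejects it.
    public static boolean parses(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without logging to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Case 1: unbound prefix in a tag name, then the coerced name.
        System.out.println(parses("<html><test:h1>Hi</test:h1></html>"));
        System.out.println(parses("<html><testU00003Ah1>Hi</testU00003Ah1></html>"));
        // Case 2: unbound prefix in an attribute, then with xlink declared on html.
        System.out.println(parses("<html><a xlink:href=\"UnboundPrefix\">x</a></html>"));
        System.out.println(parses("<html xmlns:xlink=\"http://www.w3.org/1999/xlink\">"
                + "<a xlink:href=\"UnboundPrefix\">x</a></html>"));
    }
}
```

In this sketch the unbound forms are rejected, while the coerced tag name and the declared xlink namespace both parse, which matches the two fixes the spec sections describe.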