philss / floki

Floki is a simple HTML parser that enables search for nodes using CSS selectors.
https://hex.pm/packages/floki
MIT License

Selectors that include colons are treated as pseudo-class selectors #172

Closed merongivian closed 6 years ago

merongivian commented 6 years ago

First of all, thanks for the awesome library. It works great for parsing XML.

I'm trying to use find to get a node whose tag name contains a colon. In this example, I want to get yt:channelId:

<feed xmlns:yt="http://www.youtube.com/xml/schemas/2015" xmlns:media="http://search.yahoo.com/mrss/" xmlns="http://www.w3.org/2005/Atom">
  <link rel="self" href="http://www.youtube.com/feeds/videos.xml?user=google"/>
  <id>yt:channel:UCK8sQmJBp8GCxrOtXWBpyEA</id>
  <yt:channelId>UCK8sQmJBp8GCxrOtXWBpyEA</yt:channelId>
  <title>Google</title>
</feed>

But when I try Floki.find("yt:channelid"), I get: [warn] Pseudo-class "channelid" is not implemented. Ignoring

Is there a way to ignore the colon, so that it isn't treated as a pseudo-class selector?

mischov commented 6 years ago

@merongivian Try Floki.find(xml, "channelid") to select on just the tag, or Floki.find(xml, "yt|channelid") to select the tag with namespace.

The colon in the tag indicates a namespace.
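
For reference, here is a minimal sketch of the namespace selector applied to the feed from the question. It assumes a recent Floki release fetched with Mix.install (the version constraint is illustrative, not something stated in this thread) and uses Floki.parse_document!/1 to build the tree before querying it. The selector keeps the lowercase channelid spelling used above.

```elixir
# Assumption: Floki is fetched from Hex; the version constraint is illustrative.
Mix.install([{:floki, "~> 0.36"}])

xml = """
<feed xmlns:yt="http://www.youtube.com/xml/schemas/2015" xmlns:media="http://search.yahoo.com/mrss/" xmlns="http://www.w3.org/2005/Atom">
  <link rel="self" href="http://www.youtube.com/feeds/videos.xml?user=google"/>
  <id>yt:channel:UCK8sQmJBp8GCxrOtXWBpyEA</id>
  <yt:channelId>UCK8sQmJBp8GCxrOtXWBpyEA</yt:channelId>
  <title>Google</title>
</feed>
"""

# Parse once, then query the resulting tree.
document = Floki.parse_document!(xml)

# "yt|channelid" selects the tag together with its namespace prefix,
# so the colon is never read as the start of a pseudo-class.
channel_id =
  document
  |> Floki.find("yt|channelid")
  |> Floki.text()

IO.puts(channel_id)
```

Writing the selector as "yt:channelid" instead would be parsed as the tag yt followed by a pseudo-class named channelid, which is what produces the warning quoted in the question.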

merongivian commented 6 years ago

@mischov using the | instead of the colon worked, thanks 👍