I'm currently running into an issue where the items in a marked-up document are not found, because they all include an itemprop attribute. That keeps them from being selected when Mida::Document#extract_items is called from the constructor, because it explicitly searches for //*[@itemscope and not(@itemprop)]
I'm pretty sure the reason for this is to make sure only the top-level items in a page are grabbed, but as far as I can tell, it's valid to have an itemprop on a top-level item. On this page, for example, the top-level VideoObject has itemprop="video" on it. None of the validators I've tried so far reject this page.
I'm currently working around this with the following terrible monkeypatch:
module Mida
  class Document
    private

    # Overrides the gem's lookup so that top-level items carrying an itemprop are still found.
    def extract_items
      itemscopes = @doc.search('//*[@itemscope]')
      return nil unless itemscopes

      # strip out descendants - we only want the top level
      itemscopes = itemscopes.select do |item|
        # empty? is used here instead of blank? so the patch does not depend on ActiveSupport
        item.ancestors('//*[@itemscope]').empty?
      end

      itemscopes.collect do |itemscope|
        itemscope = Itemscope.new(itemscope, @page_url)
        Item.new(itemscope)
      end
    end
  end
end
This works the way I want it to, but I think having to check the ancestors for each hit is horrible. I'm hoping there's a better way to do this.
It's not clear to me that this gem is even maintained anymore, but I'm still putting this up here in the hope that someone has a better idea, since this gem provides pretty much everything else I need. It's just this initial select that's problematic.