paulirish / lite-youtube-embed

A faster youtube embed.
https://paulirish.github.io/lite-youtube-embed/

Not Indexable by Google #105

Closed dwsmart closed 4 months ago

dwsmart commented 2 years ago

Videos embedded with this component aren't indexable by Google (unless there's also video schema).

If you add the iframe in a <noscript> when the component loads, google can detect that.

I can open a pull request if you think this is a useful addition?

paulirish commented 2 years ago

This feels odd and i'm already pretty skeptical about it. Can you give examples of what difference the indexability of a youtube embed would make?

dwsmart commented 2 years ago

Sure,

Plenty of people embed YouTube content on pages, and would like to appear on the video tab of search. Example search (the video from my site appears here now, thanks to a <noscript> addition).

To appear here, Google needs to see an actual video URL on the page, which isn't there with the facade approach. The noscript inclusion satisfies that, and I imagine a number of publishers and creators would consider inclusion an important criterion, along with the brilliant speed improvements.

It's not the only method; including Video schema would do this too.

Perfectly understand if you feel this is out of scope for this excellent library. Just feels like a nice addition to provide indexability for video search.

Garbee commented 2 years ago

Not sure how the <noscript> could be added dynamically if scripting is turned off. Google bot runs with scripting on now, so it still wouldn't see it. (Unless they are also running with it off to compare and pick it up then.)

I do think this could be a useful documentation addition as a consideration when using libraries such as this. Potentially emphasize the video schema since that is probably the best method to achieve this vs a noscript tag. Unless there is some quantifiable proof that the bot is picking up noscript content and still using it in the index. Even if that is verifiable, best to probably encourage devs to use that on their own if they want it indexed.

dwsmart commented 2 years ago

Google indeed are way ahead in rendering pages now. The issue is more how it interacts with a page: as the <iframe> is only introduced on click, that's an interaction they won't make when passing the page through the web rendering service.

They do parse stuff out of <noscript> tags in certain circumstances. They do it for images too if there's a poor lazyload solution on those (relying on scroll, for example, instead of intersection observers; they don't scroll either, they expand the viewport).

It's something I have tested and know to work, but no one likes a 'Dude, trust me!' from a rando on GitHub. It's testable in that you can set up a page without the <noscript> and a page with it (I have a fork here that has the option I quickly put together for testing), wait for them to get indexed, and do a site: search for the pages on the video tab.

All that being said, I do agree with you really. I guess it feels 'hacky' and probably is 'hacky', and really relies on undocumented behaviour from Google's rendering pipeline, an implementation detail that could change without notice.

Video schema, and documentation, probably is the correct solution here.

Garbee commented 2 years ago

relies on undocumented behaviour from Google's rendering pipeline

This is exactly why I wouldn't just <noscript> it. Without a solid confirmation of some kind from the search team that this is an acceptable method to use, avoid it. In the future they could very well determine that this is being used to push spam or is being misused in other ways, and then change it.

Video schema is a direct, documented thing that should achieve the result. So, that would be what we should inform people of. Pretty sure we can't go adding that dynamically. It would need to be dev-driven in their initial content. If we could script that and have it pick up, then there is a pathway to discuss getting it in.

paulirish commented 2 years ago

It's testable in that you can set up a page without the <noscript> and a page with it (I have a fork here that has the option I quickly put together for testing), wait for them to get indexed, and do a site: search for the pages on the video tab.

can you show screenshots from search results or something that demonstrates the difference it makes?

dwsmart commented 2 years ago

Sure. (screenshot: noscript)

That highlighted video in the video search tab is there because of the <noscript> (just manually added in this case). It was not there until I added this and the page was reindexed.

I've set up 2 test pages, one with noscript and one without it, and I've submitted these for indexing as a more direct test. As soon as they are indexed (Google can be reluctant to rush to index test pages these days!) I'll report back with a like-for-like comparison.

dwsmart commented 2 years ago

Test pages now indexed. Both show in a web site:https://testing.tamethebots.com/lite/ search (screenshot: both-indexed).

But only the <noscript> version is indexed for video search (screenshot: only-noscript-video-search).

benj-p commented 2 years ago

Thanks a lot @dwsmart! I've observed a large drop in traffic following the implementation of Lite YouTube Embed, so I'm now using your forked version while I implement Video Schema.

HunterJacobs commented 1 year ago

Implementing this would be very helpful! Worried about losing traffic as my pages are being de-indexed.

dwsmart commented 1 year ago

@Jacohun1, I would consider going the schema route where possible. This online tool classyschema.org makes it very easy to create Video markup from YouTube videos.

HunterJacobs commented 1 year ago

@dwsmart would you know if adding schema for video would still be beneficial on an eCommerce product?

dwsmart commented 1 year ago

probably beyond the scope for discussion on the issues of a plugin, but can be beneficial, if you think converting customers would be searching for videos, and then going on to buy. There's no one answer there really.

SagaciousBen commented 1 year ago

I switched to a facade script years ago (5+), and I'm just now discovering why my website's traffic has precipitously dropped. This script is significantly better than the script I use, so I'm going to switch regardless. But I am on the edge of my seat reading this to see if this has been implemented. I'm not a coder, but it blows my mind to see (what seem to me) extremely intelligent and capable coders not understanding the value of this change. I'd pay $200, this second, for an implementation that includes future-proof video schema. I'm leaning towards just using the YouTube embeds and eating the extra load time.

redsector72 commented 1 year ago

I've done some testing of my own with this. If we follow the idea from @dwsmart, what happens is that the noscript code actually becomes scripted and executed (the browser creates the iframe and downloads all the YT scripts and even the images). This defeats the whole purpose of using YT lite, as you're downloading all the YT stuff (twice if the video gets clicked) plus the YT lite stuff. You can check this behavior by looking at the "elements" or "network" tab of any web inspector.

dwsmart commented 1 year ago

@redsector72 Looks like a change in the way chrome handles injected <noscript>

If you change (edit for clarity in my fork, which I've updated)

noscriptEl.appendChild(iframeEl);

to

noscriptEl.innerHTML = iframeEl.outerHTML;

It behaves as before and doesn't download the resources.
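
For illustration, a minimal sketch of that string-based approach. This is not the code from the fork or from the library: `buildNoscriptIframe` and `attachNoscript` are hypothetical names, and the `https://www.youtube.com/embed/` URL form is an assumption about what the iframe should point at. The idea is that the iframe is produced as markup text and assigned through `innerHTML`, rather than created as a live element and appended.

```javascript
// Hypothetical helper: returns iframe markup for a YouTube video as a string.
function buildNoscriptIframe(videoId, title) {
  // Escape characters that would break out of a double-quoted attribute value.
  const esc = (s) =>
    String(s).replace(/&/g, '&amp;').replace(/"/g, '&quot;').replace(/</g, '&lt;');
  // Assumed embed URL form; swap in whatever host the page actually uses.
  const src = `https://www.youtube.com/embed/${encodeURIComponent(videoId)}`;
  return `<iframe src="${src}" title="${esc(title)}" allowfullscreen></iframe>`;
}

// Hypothetical browser-side usage: hostEl is the element wrapping the facade.
function attachNoscript(hostEl, videoId, title) {
  const noscriptEl = hostEl.ownerDocument.createElement('noscript');
  // Assign markup as a string instead of appending a live iframe element.
  noscriptEl.innerHTML = buildNoscriptIframe(videoId, title);
  hostEl.append(noscriptEl);
  return noscriptEl;
}
```

Whether the browser then requests anything can be checked in the network tab of the devtools, as described in this thread.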

dwsmart commented 1 year ago

FYI search console released the video indexing stuff in the URL inspection tool, so you can now see definitively the difference:

noscript version - shows video detected: https://testing.tamethebots.com/lite/with-noscript.html (URL inspection tool > live test: screenshot)

without noscript - no video detected: https://testing.tamethebots.com/lite/default-lite-youtube.html (URL inspection tool > live test: screenshot)

Schema is still the better overall option if you can add it; <noscript> is a useful fallback if you can't, though, I.M.H.O.

redsector72 commented 1 year ago

Hi @dwsmart, maybe I was misunderstood. The point is that with your implementation, probably because of a bug, the noscript section actually gets rendered. That's why it gets indexed. You can check that it gets rendered by loading your page in any browser. It will load all the YT fonts/scripts. Actually, it does that twice: once at start, and once when you start the video (the second time it's cached, but it still uses memory).

dwsmart commented 1 year ago

@redsector72

I understood you. Indeed, something had changed in how Chrome handled elements injected into a <noscript> with appendChild(), so it requested the resources. Quite when, I'm not sure.

Switching to .innerHTML = makes it behave like a normal noscript again, and your browser won't request the resources, but Google can still see it.

Chrome: chrome devtools showing network activity

Firefox: firefox devtools showing network activity

caub commented 1 year ago

For indexation, you should rather look into https://developers.google.com/search/docs/appearance/structured-data/video (microdata or json-ld)
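
As a rough sketch of what JSON-LD video markup could look like for a YouTube embed: the helper names `buildVideoJsonLd` and `toJsonLdScript` are hypothetical, the input values are placeholders, and the thumbnail and embed URL patterns are assumptions. Check the required and recommended properties against the documentation linked above before relying on it.

```javascript
// Hypothetical helper: builds a schema.org VideoObject for a YouTube video.
function buildVideoJsonLd({ videoId, name, description, uploadDate }) {
  const id = encodeURIComponent(videoId);
  return {
    '@context': 'https://schema.org',
    '@type': 'VideoObject',
    name,
    description,
    uploadDate, // ISO 8601 date string supplied by the page author
    // Assumed URL patterns for the poster image and the embeddable player.
    thumbnailUrl: [`https://i.ytimg.com/vi/${id}/hqdefault.jpg`],
    embedUrl: `https://www.youtube.com/embed/${id}`,
  };
}

// Hypothetical helper: serializes the object into a script tag for the page HTML.
function toJsonLdScript(data) {
  // Escape "<" so a string value cannot close the script element early.
  const json = JSON.stringify(data).replace(/</g, '\\u003c');
  return `<script type="application/ld+json">${json}</script>`;
}
```

The markup would normally be emitted server-side or at build time, in line with the earlier point in this thread that schema needs to be part of the developer's initial content.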

paulirish commented 4 months ago

@dwsmart dave, thank you for diving into this.. both proving a problem and finding a solution. :)

also.. hi! i use your crux pages on tame-the-bots. :) great stuff.

I landed your commit 83334447cd9c7b57101a0a86b8f3d17ac9e62105 and did some refactors afterwards: 12cbfa1ad641c2d66dcd4176df69e3c7c125db2c

this should now be.. FIXED! :) It's not opt-in. I have the noscript element enabled in all cases.


@caub if there's a structured data solution here.. i'm definitely curious. it'd require some experimentation.

dwsmart commented 4 months ago

Thanks @paulirish!

Genuinely chuffed you find the crux stuff helpful!