verlok / vanilla-lazyload

LazyLoad is a lightweight, flexible script that speeds up your website by deferring the loading of your below-the-fold images, backgrounds, videos, iframes and scripts until they enter the viewport. Written in plain "vanilla" JavaScript, it leverages IntersectionObserver, supports responsive images and enables native lazy loading.
https://www.andreaverlicchi.eu/vanilla-lazyload/
MIT License
7.72k stars · 676 forks

Make sure Google can see lazy-loaded content #277

Closed jimmleon closed 5 years ago

jimmleon commented 6 years ago

I tried to run a test for my website as suggested in this link: https://developers.google.com/search/docs/guides/lazy-loading by using a Puppeteer script, but the test result was: "Lazy images loaded correctly: Failed".

I then ran the test against the two responsive-image lazyload demos provided in the docs (srcset, and with the picture tag), e.g. https://www.andreaverlicchi.eu/lazyload/demos/with_srcset_lazy_sizes.html, and the result was once again "Lazy images loaded correctly: Failed".

However, running the same test against the normal img demo (no srcset, no picture), the result was "Passed".

Any idea what is wrong?

verlok commented 5 years ago

Hi @DimLeon, I'm sorry I'm so late in replying, but I've been very busy in the last few days.

That's a good point, and a good question, thank you for asking it.

I didn't get which demo is the "normal img demo" that works correctly, but I'll run some tests myself using Puppeteer too. I'll keep you updated.

verlok commented 5 years ago

Hey @DimLeon, I can't explain it: I did the following to test the "simple" demo...

node lazyimages_without_scroll_events.js -h --url https://www.andreaverlicchi.eu/lazyload/demos/simple.html

...and the result from Puppeteer, which FAILED, says:

"If there are more images in the screenshot below, the page is using scroll events to lazy load images. Instead, consider using another approach like IntersectionObserver."

My objection:

I copied the bot-detection approach from LazySizes, which is the first script the Google Developers page recommends. :)

So help me out here, I don't understand what to do next to fix this issue.

jimmleon commented 5 years ago

Thanks for the reply @verlok, and thank you for this plugin. By "normal img demo" I meant the "simple" lazyload demo, my bad 🙂. I tried running the script shown on this page: https://github.com/GoogleChromeLabs/puppeteer-examples/blob/master/lazyimages_without_scroll_events.js. I then attempted to test all the demo URLs found in the lazyload "Recipes" section. The only demos that passed were "dynamic content", "Lazy Lazyload" and "Iframes". The result I get for the failed ones is:

Lazy images loaded correctly: Failed Found 1971192 pixels differences. Dimension image A: 250x2874 Dimension image B: 250x2874

The result in the "passed" demos is:

Lazy images loaded correctly: Passed Found 0 pixels differences. Dimension image A: 250x500 Dimension image B: 250x500

Doesn't really make sense to me. I don't know how to help, since there are no scroll events in your code and Intersection Observer is used instead. Please let me know if you figure anything out.

I'll try to use the Googlebot Images crawler as indicated here: https://support.google.com/webmasters/answer/6066468?hl=en and see the results. Cheers!

rasulovdev commented 5 years ago

I'm experiencing the same issues. I have ordinary img with data-src and elements with data-bg. Page doesn't pass Puppeteer tests.

I've tried:

It never passes 🤷‍♂️

In the first screenshot ("Page without being scrolled") there are no images; in the last ("Page after scrolling") some of them appear.

EDIT: In my case this is caused by first screen being 100vh. After removing that rule, Puppeteer output screenshots are the same.

The test says "FAILED", but that's because I'm using some parallax, so when the two screenshots are compared, the images are misaligned after scrolling.

dan-ding commented 5 years ago

Setting small dimensions (400 x 300) in that puppeteer script, I have yet to find a lazyload example (this or other libraries/script) that passes. Medium articles pass -- however Medium articles also appear fine with javascript disabled entirely...

testing: https://rawgit.com/GoogleChromeLabs/puppeteer-examples/master/html/lazyload.html fails with the small viewport

which makes me wonder if the script is a good example; Puppeteer is creating a page and taking a screenshot of it in its entirety, and that function appears not to trigger the IntersectionObserver callbacks (https://github.com/GoogleChromeLabs/puppeteer-examples/blob/59355609ecb3c2e396a289b28f34d5116fc89b8e/lazyimages_without_scroll_events.js#L131)?

I think using a placeholder - actually having a src - is key to making everyone happy. Besides, not having a src isn't great for accessibility.
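A sketch of that placeholder idea (the inline SVG and file name are just examples, not something from this library): a tiny valid src keeps the markup valid and gives crawlers and non-JS users something to render, while data-src holds the real image for LazyLoad to swap in.

```html
<!-- Tiny inline SVG placeholder keeps the img valid before lazy loading kicks in -->
<img class="lazy"
     src="data:image/svg+xml,%3Csvg xmlns='http://www.w3.org/2000/svg' width='4' height='3'%3E%3C/svg%3E"
     data-src="photo.jpg"
     alt="A descriptive alternative text">
```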

verlok commented 5 years ago

Would you try with version 11.0.0?

otterslide commented 5 years ago

Would you try with version 11.0.0?

I just tried this version and ran Fetch as Google on my site. Only the first image shows; the rest don't load. Additionally, Google ends up failing to even run Masonry, as if it hits some error. I'm also using srcset. If I take out the data-src and data-srcset from my images, Fetch as Google renders fine and Masonry runs OK.

I'm not sure what the issue could be. Does LazyLoad() have to be run after document.ready()? Or can it be called anywhere? I called it right after loading the script.

I tried with data-src only, and no srcset, and still the same problem. It loads fine on my end, but Google seems to run into some issue.

verlok commented 5 years ago

Yes, the DOM needs to be ready. To be sure of that, you can put the script that creates the LazyLoad instance at the end of the <body> tag, just before its closing tag.

From the recently updated README (which I invite you to read and give feedback on, especially the “Getting started” section):

Be sure that DOM is ready when you instantiate LazyLoad. If you can't be sure, or other content may arrive at a later time via AJAX, you'll need to call lazyLoadInstance.update(); to make LazyLoad check the DOM again.
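As a sketch, the placement described above looks like this (the .lazy selector, the script path and the file name are just examples):

```html
<body>
  <!-- ... page content ... -->
  <img class="lazy" data-src="photo.jpg" alt="A photo">

  <!-- Instantiate LazyLoad just before the closing body tag,
       when the DOM above is already parsed -->
  <script src="lazyload.min.js"></script>
  <script>
    var lazyLoadInstance = new LazyLoad({ elements_selector: ".lazy" });
    // If more content arrives later via AJAX, re-scan the DOM:
    // lazyLoadInstance.update();
  </script>
</body>
```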

Hope this helps!

verlok commented 5 years ago

Moreover, is there any chance that JS might be disabled in the Chrome run by Puppeteer?

dan-ding commented 5 years ago

It runs, @verlok, as far as I can see -- with the exception of when a screenshot is taken. It's easy to replicate with Chrome DevTools.

The print preview is missing all the images that haven't been scrolled to; they don't show because they don't have a src set.

verlok commented 5 years ago

Thank you @dan-ding for your explanation, but that's not the case, because the LazyLoad script reads the user agent string and, if it is found to be a bot like Google Bot, Bing Bot or others, triggers the loadAll() method immediately. So the page behaves differently when a regular browser visits it than when a search engine crawler does.

dan-ding commented 5 years ago

Ya -- just trying to help(?) some understand what the Puppeteer script is doing.

I'm not a fan of the UA sniffing; however, I'm not raising an issue about it.

verlok commented 5 years ago

I'm not a fan of the ua sniffing

Me neither, but I didn’t find another way to do that.

If anyone has any suggestions, they're welcome.

verlok commented 5 years ago

[screenshot: the isBot check run in the browser console]

verlok commented 5 years ago

This is the code LazyLoad uses to detect whether or not the browser is a bot.

export const runningOnBrowser = typeof window !== "undefined";

export const isBot =
    (runningOnBrowser && !("onscroll" in window)) ||
    (typeof navigator !== "undefined" &&
        /(gle|ing|ro)bot|crawl|spider/i.test(navigator.userAgent));

First of all, it checks that we're running in a browser; then whether the window object lacks the onscroll property (a hint that it's not a regular browser); and finally it falls back to "UA sniffing" to check whether the user agent matches a known crawler.
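That check can be exercised outside the browser with a small sketch; the fake window objects and UA strings below are stand-ins for the real browser globals, and isBotLike is a hypothetical helper mirroring the logic above, not part of the library:

```javascript
// Stand-alone sketch of the isBot logic quoted above, runnable in Node.
const BOT_UA_PATTERN = /(gle|ing|ro)bot|crawl|spider/i;

function isBotLike(windowObj, userAgent) {
  // A real browser exposes an "onscroll" property on window;
  // some headless crawlers historically did not.
  const missingOnScroll = !("onscroll" in windowObj);
  return missingOnScroll || BOT_UA_PATTERN.test(userAgent);
}

// Browser-like window, but a Googlebot UA -> detected via the regex
console.log(isBotLike({ onscroll: null },
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")); // true

// Browser-like window with a regular Chrome UA -> not a bot
console.log(isBotLike({ onscroll: null },
  "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 Chrome/70.0.3538.77 Safari/537.36")); // false

// No onscroll on window -> treated as a bot regardless of UA
console.log(isBotLike({},
  "Mozilla/5.0 (Windows NT 10.0) Chrome/70.0")); // true
```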

In my previous comment, I tested that code in the browser console and, as you can see, if "onscroll" is missing from window, isBot becomes true.

Also other lazyload libraries use this same technique, so I guess this is working, but:

1) I cannot test this using Puppeteer.
2) I'm not sure whether window.onscroll is missing in Puppeteer, or whether its user agent contains the Google Bot user agent.

Can you help me out with this?

dan-ding commented 5 years ago

@verlok i wanted to avoid the question of the UA as I understand why you put it in. It wasn't meant to be a complaint. I have only one possible idea, but haven't tested it yet, so it may be useless...

Also -- the example puppeteer script does not set the UA to googlebot without modification.

I'll write up some better tests (since that example doesn't use the lazyload we care about ;) and get back to you

verlok commented 5 years ago

Thank you 🙏🏼

otterslide commented 5 years ago

Google scrolls down just like any other browser. I am using another jQuery script that does not do any UA testing, and Fetch as Google was loading every image fully. At least for Google, and probably the major bots, it is not necessary to check. In fact, Google will visit the website with multiple user agents, without announcing itself as Googlebot, just to make sure you're not cloaking your site or serving something different only to bots.

I should've tested this plugin better, but now the good Fetch as Google tool that fetched the first 10,000 pixels has been taken offline, and it's impossible to test lazy image loading any more, because the new tool only shows the first image. Very sad that Google has chosen to do this.

verlok commented 5 years ago

Thank you for your contribution @otterslide. I really don’t know how to test this as Google.

Just to make it clearer: LazyLoad first checks if the onscroll event is present in the window object (if it isn't, it's likely a crawler), then checks if the user agent matches any bot using a regular expression (just in case). If either check matches, it loads all images immediately.

otterslide commented 5 years ago

Using the old Fetch as Google tool, this script was only loading the first image for me, which was very odd. The new Fetch as Google tool does not scroll down at all, so I cannot test any lazy loading scripts any longer. I'd encourage everyone to click the feedback button in the Webmaster Fetch as Google tool and complain that the screenshot is too small.

Having said that, it is said that Googlebot uses Chrome 41. I tested your demo page with Chrome 41 and it pre-loads all images right off the start, so probably Chrome 41 doesn't have scroll events.

Unfortunately, I am finding that scroll-event scripts get deferred by ad loading. The scripts initialize, but Google DFP ads somehow block the entire script from loading anything until the ads are completely loaded; maybe they block the scroll event, I'm not sure. With IntersectionObserver I find this doesn't happen, and loading is quite a bit faster. I can also use the polyfill, so even on Chrome 41 the images can be lazy loaded. As you probably know, IntersectionObserver also computes intersections asynchronously, outside normal script execution, which should be better for performance.

I didn't want to open another issue for this (let me know if I should), but I'd be interested in your input on the possibility of other scripts interfering with the lazy load script. I only noticed recently because some ads were taking very long to load; it would take a good 5-10 seconds to start loading any image.

Thanks.

knoxcard commented 5 years ago

I believe I solved this issue tonight, it finally dawned upon me...

new LazyLoad({
    elements_selector: '.lazy'
})
$('img').one('error', function(err) {
    $(this).remove()
}).one('load', function() {
    $(this).attr('draggable', 'false')
    $(this).attr('src', $(this).attr('data-src'))
    $(this).removeAttr('data-src')
})
verlok commented 5 years ago

@knoxcard thanks for your comment.

It just seems to add two event listeners to each image.

What does this kind of code solve?

jimmleon commented 5 years ago

Some conclusions I came to:

I used Google's Search Console and did some URL inspection for my website (the old 'Fetch as Google').

Unfortunately, my lazy images are not shown in the screenshot preview. However, when I did a Google search for a random URL of my website and clicked "view cached", the lazy images in the cached preview of that page were displayed properly. The cached page, as far as I know, is a snapshot of the page taken by Googlebot, and the content in that snapshot is the content crawlable by Google.

Related to the above, some SEO best practices advise developers using image lazy-loading techniques to add a <noscript> tag containing an <img> with a regular src.

Any thoughts on this or if this is indeed a good practice would be more than helpful.
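For reference, that <noscript> pattern is usually sketched like this (the file name is just an example): the lazy img is what JavaScript-enabled browsers use, and the plain img inside <noscript> is what a non-JS crawler or user sees.

```html
<img class="lazy" data-src="photo.jpg" alt="A photo">
<noscript>
  <img src="photo.jpg" alt="A photo">
</noscript>
```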

knoxcard commented 5 years ago

@verlok - HTML standards require <img> tags to have a src attribute. When I ran the markup produced by this library through https://validator.w3.org/#validate_by_input, warnings and errors were displayed.

jimblue commented 5 years ago

Some ideas to check whether the user is a bot and, if so, load all images:

Very powerful but heavy: https://github.com/JefferyHus/es6-crawler-detect https://github.com/gorangajic/isbot

A faster version: https://github.com/mahovich/isbot-fast

juanmardefago commented 5 years ago

@verlok As stated in the PR for the Puppeteer script (https://github.com/GoogleChromeLabs/puppeteer-examples/pull/30) that is referenced in the Google docs, adding the UA to the script makes it work.

Even so, the results should always be interpreted by a human. I used that script to test the site I'm working on right now: the test "failed", but the images were almost exactly the same, except that the images at the far bottom had a different opacity. In my case, that means they were the last ones scrolled to and their 1-second opacity animation was just finishing. Of course the script can't know that, but since those were the only pixels that varied, I can take the result as a good one nonetheless.

I'll ask our client if we can upload those images to better exemplify it, since it might be somewhat confusing haha.

juanmardefago commented 5 years ago

Also, since adding the UA to that script made it work, I guess the onscroll event and the window object exist in that Puppeteer environment, which might not be exactly the same as when Googlebot crawls the site.

Also, the script is supposed to tell whether a lazyload approach uses scroll events or the Intersection Observer API, but I'm guessing it's not working as expected, since this solution uses the IO API.

For the time being, I wouldn't worry too much about this.

juanmardefago commented 5 years ago

Here's the picture of how it ended up

[screenshot: page_diff]

The script returned a Failed result, since there are lots of pixels highlighted in the diff, but I consider it a pass.

verlok commented 5 years ago

Thank you all!