Closed jensolafkoch closed 3 years ago
I believe it hangs when accessing images. Next time it was:
FetchError: request to https://tailwindui.com/img/category-thumbnails/sections/heroes.png failed,
Sorry. That’s another place where fetch is called. I’ll add retry logic.
I refactored the code to use fetchWithRetry everywhere. Please try the latest on the master branch.
Both attempts (with just html and then all languages) finished successfully. (FYI: It took a minute or two after the last directory/file was written locally before the "Done" message finally appeared.)
Yes. If you set BUILDINDEX=1 then the final index.html page also downloads the thumbnails used in the component categories.
There are 64 images there so if your Internet is slow, it may take a while.
Right now it downloads everything all the time. I can add a check to see if the file already exists and send an If-Modified-Since header to only download updated files.
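A minimal sketch of what such a check might look like, assuming the local file's modification time stands in for the last download time. The helper name `conditionalHeaders` is hypothetical and not taken from the crawler's code.

```javascript
const fs = require('fs');

// Hypothetical helper: build conditional request headers from a local file.
function conditionalHeaders(localPath) {
  try {
    const { mtime } = fs.statSync(localPath);
    // toUTCString() yields the HTTP-date form, e.g. "Wed, 01 Jan 2020 00:00:00 GMT".
    return { 'If-Modified-Since': mtime.toUTCString() };
  } catch (err) {
    // No local copy yet: send no conditional header, so the file is downloaded.
    if (err.code === 'ENOENT') return {};
    throw err;
  }
}
```

The returned object can be passed as the `headers` option of a fetch call. A response with status 304 then means the local copy can be kept as is.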
I logged timings and yeah, it adds up.
333 ms https://tailwindui.com/img/category-thumbnails/sections/heroes.png
286 ms https://tailwindui.com/img/category-thumbnails/sections/feature-sections.png
307 ms https://tailwindui.com/img/category-thumbnails/sections/cta-sections.png
191 ms https://tailwindui.com/img/category-thumbnails/sections/pricing.png
360 ms https://tailwindui.com/img/category-thumbnails/sections/header.png
499 ms https://tailwindui.com/img/category-thumbnails/sections/faq-sections.png
178 ms https://tailwindui.com/img/category-thumbnails/sections/newsletter-sections.png
192 ms https://tailwindui.com/img/category-thumbnails/sections/stats-sections.png
316 ms https://tailwindui.com/img/category-thumbnails/sections/testimonials.png
168 ms https://tailwindui.com/img/category-thumbnails/sections/blog-sections.png
159 ms https://tailwindui.com/img/category-thumbnails/sections/contact-sections.png
151 ms https://tailwindui.com/img/category-thumbnails/sections/team-sections.png
277 ms https://tailwindui.com/img/category-thumbnails/sections/content-sections.png
341 ms https://tailwindui.com/img/category-thumbnails/sections/footers.png
395 ms https://tailwindui.com/img/category-thumbnails/sections/logo-clouds.png
165 ms https://tailwindui.com/img/category-thumbnails/elements/headers.png
398 ms https://tailwindui.com/img/category-thumbnails/elements/banners.png
182 ms https://tailwindui.com/img/category-thumbnails/elements/flyout-menus.png
182 ms https://tailwindui.com/img/category-thumbnails/page-examples/landing-pages.png
182 ms https://tailwindui.com/img/category-thumbnails/page-examples/pricing-pages.png
171 ms https://tailwindui.com/img/category-thumbnails/page-examples/contact-pages.png
294 ms https://tailwindui.com/img/category-thumbnails/application-shells/stacked.png
412 ms https://tailwindui.com/img/category-thumbnails/application-shells/sidebar.png
284 ms https://tailwindui.com/img/category-thumbnails/application-shells/multi-column.png
415 ms https://tailwindui.com/img/category-thumbnails/headings/page-headings.png
302 ms https://tailwindui.com/img/category-thumbnails/headings/card-headings.png
434 ms https://tailwindui.com/img/category-thumbnails/headings/section-headings.png
154 ms https://tailwindui.com/img/category-thumbnails/data-display/description-lists.png
284 ms https://tailwindui.com/img/category-thumbnails/data-display/stats.png
303 ms https://tailwindui.com/img/category-thumbnails/lists/tables.png
365 ms https://tailwindui.com/img/category-thumbnails/lists/stacked-lists.png
111 ms https://tailwindui.com/img/category-thumbnails/lists/grid-lists.png
177 ms https://tailwindui.com/img/category-thumbnails/lists/feeds.png
198 ms https://tailwindui.com/img/category-thumbnails/forms/form-layouts.png
411 ms https://tailwindui.com/img/category-thumbnails/forms/input-groups.png
273 ms https://tailwindui.com/img/category-thumbnails/forms/select-menus.png
393 ms https://tailwindui.com/img/category-thumbnails/forms/sign-in-forms.png
161 ms https://tailwindui.com/img/category-thumbnails/forms/radio-groups.png
353 ms https://tailwindui.com/img/category-thumbnails/forms/toggles.png
159 ms https://tailwindui.com/img/category-thumbnails/forms/action-panels.png
158 ms https://tailwindui.com/img/category-thumbnails/feedback/alerts.png
278 ms https://tailwindui.com/img/category-thumbnails/navigation/navbars.png
189 ms https://tailwindui.com/img/category-thumbnails/navigation/pagination.png
332 ms https://tailwindui.com/img/category-thumbnails/navigation/tabs.png
296 ms https://tailwindui.com/img/category-thumbnails/navigation/vertical-navigation.png
291 ms https://tailwindui.com/img/category-thumbnails/navigation/sidebar-navigation.png
345 ms https://tailwindui.com/img/category-thumbnails/navigation/breadcrumbs.png
168 ms https://tailwindui.com/img/category-thumbnails/navigation/steps.png
339 ms https://tailwindui.com/img/category-thumbnails/overlays/modals.png
163 ms https://tailwindui.com/img/category-thumbnails/overlays/slide-overs.png
309 ms https://tailwindui.com/img/category-thumbnails/overlays/notifications.png
400 ms https://tailwindui.com/img/category-thumbnails/elements/avatars.png
161 ms https://tailwindui.com/img/category-thumbnails/elements/dropdowns.png
177 ms https://tailwindui.com/img/category-thumbnails/elements/badges.png
462 ms https://tailwindui.com/img/category-thumbnails/elements/buttons.png
170 ms https://tailwindui.com/img/category-thumbnails/elements/button-groups.png
278 ms https://tailwindui.com/img/category-thumbnails/layout/containers.png
341 ms https://tailwindui.com/img/category-thumbnails/layout/panels.png
177 ms https://tailwindui.com/img/category-thumbnails/layout/list-containers.png
163 ms https://tailwindui.com/img/category-thumbnails/layout/media-objects.png
403 ms https://tailwindui.com/img/category-thumbnails/layout/dividers.png
353 ms https://tailwindui.com/img/category-thumbnails/page-examples/home-screens.png
179 ms https://tailwindui.com/img/category-thumbnails/page-examples/detail-screens.png
337 ms https://tailwindui.com/img/category-thumbnails/page-examples/settings-screens.png
📝 Writing /components/index.html
🏁 Done!
Ok, I added support for If-Modified-Since... but it doesn't seem to improve performance all that much. The files are pretty small, so most of the time goes into making the request itself.
Anyway, I also include more logging so hopefully you can see what's going on.
Get latest and let me know how it goes for you.
Looks fine. There are still some long delays (much longer than the ms values shown) so maybe my local dev environment (Windows, with Laragon as a kind of XAMPP) has something to do with it? Anyway, as long as it runs till the end, I'm happy! :-)
The messages are cut off, no big deal:
The time displayed is only showing the last successful one.
I guess I can log when it times out and has to retry. I'm pretty much brute forcing the connection: it keeps trying until it succeeds or has retried 3 times.
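The retry-and-log behaviour described here could be sketched roughly as follows. This is not the crawler's actual `fetchWithRetry`. The signature, the injected `fetchFn`, and the `log` option are assumptions made so the example runs without a network, and "3 retries" is read as one initial attempt plus up to three more.

```javascript
// Sketch only: fetchFn is any function returning a promise (e.g. node-fetch).
async function fetchWithRetry(fetchFn, url, { retries = 3, log = console.warn } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fetchFn(url);
    } catch (err) {
      lastError = err;
      // Log every failed attempt, not just the final outcome.
      log(`attempt ${attempt + 1} failed for ${url}: ${err.code || err.message}`);
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

Logging inside the `catch` makes the time lost to failed attempts visible, which a per-request timing of only the successful attempt would hide.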
Yes, I truncated to 80 characters to prevent terminal wrapping since these URLs were getting very long.
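For illustration, truncating a log line to a fixed width can be as simple as the sketch below. The function name is hypothetical, and falling back to the terminal's reported width via `process.stdout.columns` is an assumption, not something the crawler is known to do.

```javascript
// Cut a log line to at most `width` characters so it does not wrap.
// process.stdout.columns is only defined when stdout is a TTY, hence the fallback.
function truncateForTerminal(line, width = process.stdout.columns || 80) {
  return line.length > width ? line.slice(0, width) : line;
}
```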
I am absolutely happy as it is - thanks again for your package! :-)
If you check for existing files, do you also purge files, directories, and preview images which no longer exist? I'm just curious whether to start from scratch every now and then or just update the current tree.
Not at the moment. But I'm rethinking the process. If you use the GitHub action, it always does a fresh checkout of the target repository. So all the file times will be right now. So when it sends the If-Modified-Since header, 99.9% of the time, you'll get a 304 Not Modified result.
A couple of options.
If-None-Match: etag header. Still not sure if it buys us anything in time saved, but it's more accurate. Also, I can compare the assets just downloaded with the files on disk and remove any that are not in the list. Again, the files are small, so the space saving is negligible. I'll have to think on it.
I implemented option 2. It uses etags and will remove any files that are no longer referenced.
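A rough sketch of the two ideas, under stated assumptions: ETags from the previous run are kept in a plain object keyed by URL, and the set of referenced files is known as relative paths. All function names here are hypothetical and not taken from the crawler.

```javascript
const fs = require('fs');
const path = require('path');

// Assumption: etags is a plain object mapping URL -> ETag saved from the previous run.
function etagHeaders(etags, url) {
  return etags[url] ? { 'If-None-Match': etags[url] } : {};
}

// Recursively list files under dir as paths relative to dir, using "/" separators.
function listFiles(dir, prefix = '') {
  const out = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const rel = prefix ? `${prefix}/${entry.name}` : entry.name;
    if (entry.isDirectory()) out.push(...listFiles(path.join(dir, entry.name), rel));
    else out.push(rel);
  }
  return out;
}

// Delete every file under dir that is not in `referenced`; return the removed paths.
function pruneUnreferenced(dir, referenced) {
  const keep = new Set(referenced);
  const removed = listFiles(dir).filter((rel) => !keep.has(rel));
  for (const rel of removed) fs.unlinkSync(path.join(dir, rel));
  return removed;
}
```

Unlike file modification times, a stored ETag does not change when the repository is freshly checked out, which is why it is the more accurate of the two checks. This sketch leaves empty directories in place after pruning.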
I'm going to close this. Re-open if you experience more timeouts. Thanks!
Today, the ETIMEDOUT error was back again. Difference is that I crawled all languages today (instead of just HTML), but I don't think it has anything to do with the timeout, as a lot of directories were processed correctly before:
‼️ FetchError: request to https://tailwindui.com/img/category-thumbnails/sections/content-sections.png failed, reason: connect ETIMEDOUT 172.67.217.20:443 at ClientRequest. (D:\laragon\www\xxx-tailwindui-crawler\node_modules\node-fetch\lib\index.js:1461:11)
at ClientRequest.emit (events.js:315:20)
at TLSSocket.socketErrorListener (_http_client.js:469:9)
at TLSSocket.emit (events.js:315:20)
at emitErrorNT (internal/streams/destroy.js:106:8)
at emitErrorCloseNT (internal/streams/destroy.js:74:3)
at processTicksAndRejections (internal/process/task_queues.js:80:21) {
type: 'system',
errno: 'ETIMEDOUT',
code: 'ETIMEDOUT'
}
(Maybe it would be helpful to show the timeout value as part of the error output by default?)
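One way this suggestion might be sketched is a small formatter that appends the configured timeout to the error summary. The function and its `timeoutMs` parameter are hypothetical. Note also that a `connect ETIMEDOUT` may come from the operating system's connection timeout rather than a value the application set, in which case there is no configured value to show.

```javascript
// Hypothetical formatter: timeoutMs is whatever timeout the caller configured, if any.
function describeFetchError(err, url, timeoutMs) {
  const code = err.code || err.name || 'Error';
  const timeoutNote = timeoutMs == null ? 'timeout: not set' : `timeout: ${timeoutMs} ms`;
  return `${code} for ${url} (${timeoutNote})`;
}
```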