Open tidianelyahmed opened 3 years ago
@tidianelyahmed, could you provide your code and input URLs 😃? Let's see what we can find!
Thank you for your reply.
Below are my CLI command and my CSV file (header and URLs):
lighthouse-batch-parallel -n 64 -l input.csv
Device,URL
desktop,http://www.auto-doc.fr/pieces-detachees/joint-de-collecteur-dechappement-10331/citroen
desktop,https://www.tracteurpool.fr/details/Tracteurs-agricoles-standard/Renault-321-4-carraro/4255416/
desktop,http://www4.total.fr/Europe/France/pdf/lubrifiants-moteur/TOTAL-TRANSMISSION-GEAR-8-FE-75W-80-032015-FR.pdf
desktop,https://m.piecesauto-pro.fr/pieces-detachees/filtre-a-huile-hydraulique
desktop,https://www.mister-auto.com/batterie-voiture/60ah/
desktop,https://www.leboncoin.fr/materiel_agricole/offres/bretagne/cotes_d_armor/p-8/
desktop,https://www.60millions-mag.com/forum/communication-et-internet/oscaro-pieces-auto-discount-t3093.html
desktop,https://agripelle.com/creat_pdf2.php?id_prod=3049
desktop,https://www.yakarouler.com/pieces-auto/m/renault
desktop,https://www.agriconomie.com/pieces-agricoles/tracteur/signalisation---eclairage/feu/pc6108/john-deere-b320/tracteur-john-deere-4240s-m246774
desktop,https://www.cairn.info/publications-de-Olivier-Van%20der%20Noot--119507.htm
desktop,http://www.avenirmotoculture.com/ulf/promodis/fichiers/1/CMP/Formuledirecte/fdsol.PDF
desktop,https://www.calameo.com/books/001104130b9472ba69bac
desktop,http://www.em-consulte.com/getInfoProduit/471527/extrait/chapitre_471527.pdf
desktop,https://www.mabeo-direct.com/A-555290-total-huile-hydraulique-azolla-zs-46-20l
desktop,https://www.mister-auto.com/batterie-voiture/bolk/bol-e051060/
desktop,https://www.global-equipement.fr/wp-content/uploads/2013/07/Malaxeur-B53.pdf
desktop,https://fr.shopping.rakuten.com/s/jouet+betonniere
desktop,https://www.largus.fr/dictionnaire/conduite/sur-les-chapeaux-roue-9864408.html
desktop,https://www.vdm-reya.com/fusees
desktop,https://www.promodis.fr/massey-ferguson-6290-fr-fr.htm
desktop,http://www.va-france.com/kit-embrayage.html
desktop,http://www.rmtelevagesenvironnement.org/docs/fiches/gbpee/commun/pvb-fiche-16.pdf
desktop,https://www.groupeselectrogenes.fr/portfolio-item/ats-switch
desktop,https://www.codimatra.fr/f/pieces-hydraulique.html
desktop,https://m.piecesauto-pro.fr/febi-bilstein-8008077
desktop,https://fr.shopping.rakuten.com/offer/buy/4221123288/siku-5601.html
desktop,https://www.jma-dsm.de/dsm-loesung/schnittstelle-dsm/220-cnh-schnittstelle
desktop,https://www.tracteurpool.fr/usage/c-Kubota/308/model/B1820/
desktop,https://www.rueducommerce.fr/rayon/auto-9/ng/huile-boite-75w80
desktop,https://www.ducheminagt.be/gyrophares-flash-led-pro/658-gyrophares-led-aimante-sur-batterie-72994.html
desktop,https://www.agri-pole.com/fr/agri-pole/cla/36661619
Hi @tidianelyahmed, I tested it locally and got the same error messages for the URLs below. I collected them and tested them separately, and they ran without any problem. Then I noticed that each of these URLs comes right after a URL that is actually a PDF file, so I think that is the main problem: the browser instance cannot handle the PDF properly, which causes problems when the instance is reused for the next process.
https://m.piecesauto-pro.fr/pieces-detachees/filtre-a-huile-hydraulique
https://www.yakarouler.com/pieces-auto/m/renault
https://www.calameo.com/books/001104130b9472ba69bac
https://www.mabeo-direct.com/A-555290-total-huile-hydraulique-azolla-zs-46-20l
https://fr.shopping.rakuten.com/s/jouet+betonniere
https://www.groupeselectrogenes.fr/portfolio-item/ats-switch
I am not sure whether you intended to test them, but if you run Lighthouse in DevTools against those PDF files, you also get a lot of errors. I think Chrome's PDF reader behaves differently from normal web page rendering! Just for your information, maybe you would like to get rid of those PDF files first? 😅
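If it helps, here is a minimal Node.js sketch for removing the PDF rows before a run. It assumes the `Device,URL` layout shown above with one row per line; the function name `dropPdfRows` is made up for this example and is not part of lighthouse-batch-parallel.

```javascript
// Sketch only: drop CSV rows whose URL path ends in ".pdf" (case-insensitive).
// Assumes a "Device,URL" header followed by one "device,url" row per line.
function dropPdfRows(csvText) {
  const [header, ...rows] = csvText
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "");

  const kept = rows.filter((row) => {
    // Treat everything after the first comma as the URL (URLs may contain commas).
    const url = row.slice(row.indexOf(",") + 1).trim();
    try {
      return !new URL(url).pathname.toLowerCase().endsWith(".pdf");
    } catch {
      return true; // not a parseable URL: keep the row and let the audit report it
    }
  });

  return [header, ...kept].join("\n");
}

// Example with three rows taken from the input above.
const sample = [
  "Device,URL",
  "desktop,https://www.vdm-reya.com/fusees",
  "desktop,http://www.avenirmotoculture.com/ulf/promodis/fichiers/1/CMP/Formuledirecte/fdsol.PDF",
  "desktop,https://agripelle.com/creat_pdf2.php?id_prod=3049",
].join("\n");

console.log(dropPdfRows(sample));
```

Note that this only looks at the file extension in the URL path. A URL such as the `creat_pdf2.php` one passes the filter even though it may still return a PDF; catching those would require checking the `Content-Type` of the response, which this sketch does not do.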
Hi, thank you for your reply. I deleted all the URLs containing "pdf" and submitted 100 URLs; the last 20 URLs of input.csv had no errors. I then reran with 1000 URLs: about 10% were OK, and these were the last URLs in input.csv. I'm really lost. All the errors are errno: 'ECONNREFUSED'.
@tidianelyahmed, sorry, I don't quite follow. Do you mean that after removing all the PDF URLs you put 1000 URLs in input.csv, but only 10% of them were scanned successfully? Have you tried what I did above, that is, collecting the failing cases and auditing them separately? Or checking the URL that comes right before each failing case?
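To make the "check the URL before the failure" step concrete, here is a rough Node.js sketch. The helper name `urlsBeforeFailures` and the two arrays are assumptions for illustration; you would fill them from your own input.csv and from the URLs that failed.

```javascript
// Sketch only: for each failed URL, report the URL listed just before it
// in the input order. With parallel workers, the input order is only an
// approximation of what a given browser instance processed previously.
function urlsBeforeFailures(allUrls, failedUrls) {
  const failed = new Set(failedUrls);
  const result = [];
  allUrls.forEach((url, i) => {
    if (failed.has(url)) {
      result.push({ failed: url, previous: i > 0 ? allUrls[i - 1] : null });
    }
  });
  return result;
}

// Example using a few rows from the input above.
const allUrls = [
  "https://www.tracteurpool.fr/details/Tracteurs-agricoles-standard/Renault-321-4-carraro/4255416/",
  "http://www4.total.fr/Europe/France/pdf/lubrifiants-moteur/TOTAL-TRANSMISSION-GEAR-8-FE-75W-80-032015-FR.pdf",
  "https://m.piecesauto-pro.fr/pieces-detachees/filtre-a-huile-hydraulique",
  "https://www.mister-auto.com/batterie-voiture/60ah/",
];
const failedUrls = [
  "https://m.piecesauto-pro.fr/pieces-detachees/filtre-a-huile-hydraulique",
];

console.log(urlsBeforeFailures(allUrls, failedUrls));
```

In this example the reported `previous` entry is the `total.fr` PDF, which matches the pattern I saw locally.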
I will test that. I also found that if I reduce the number of workers from 64 to 10, it works (90% of the URLs are OK). I think the ports have something to do with it too. A month ago, the command lighthouse-batch-parallel -n 64 -l input.csv did not give any errors. If the problem persists, I will reinstall it on another VM (I am on GCP). Thanks, I'll keep you informed.
Thanks, @tidianelyahmed! Sorry, I don't have a GCP environment. If you find any clues, please post an update here!
This is my first time working with lighthouse-batch-parallel and Node.js, and I ran into this problem: I get this error on most of my URLs. Have you ever seen this error?
Error: connect ECONNREFUSED 127.0.0.1:39855
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16) {
  errno: 'ECONNREFUSED',
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 39855
}
Device = desktop || URL = https://www.vdm-reya.com/fusees
Total: 200 || Remain: 183
Error: connect ECONNREFUSED 127.0.0.1:37451
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16) {
  errno: 'ECONNREFUSED',
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 37451
}
Error: connect ECONNREFUSED 127.0.0.1:33513
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16) {
  errno: 'ECONNREFUSED',
  code: 'ECONNREFUSED',
  syscall: 'connect',
  address: '127.0.0.1',
  port: 33513
}
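To check whether every failure in a long log like the one above really is ECONNREFUSED, a small Node.js sketch can tally the error codes. It assumes the log was saved as text and that each error prints a `code: '...'` field as shown above; the name `summarizeErrors` is made up for this example.

```javascript
// Sketch only: count error codes in a saved log by matching the
// "code: '...'" field that Node.js prints for each error object.
function summarizeErrors(logText) {
  const counts = {};
  for (const match of logText.matchAll(/code: '([A-Z0-9_]+)'/g)) {
    counts[match[1]] = (counts[match[1]] || 0) + 1;
  }
  return counts;
}

// Example with two of the errors from the log above.
const sampleLog = [
  "Error: connect ECONNREFUSED 127.0.0.1:39855",
  "  errno: 'ECONNREFUSED',",
  "  code: 'ECONNREFUSED',",
  "  port: 39855",
  "Error: connect ECONNREFUSED 127.0.0.1:37451",
  "  errno: 'ECONNREFUSED',",
  "  code: 'ECONNREFUSED',",
  "  port: 37451",
].join("\n");

console.log(summarizeErrors(sampleLog)); // { ECONNREFUSED: 2 }
```

If codes other than ECONNREFUSED show up in the tally, those URLs are probably failing for a different reason and are worth auditing separately.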