philippta / flyscrape

Flyscrape is a command-line web scraping tool designed for those without advanced programming skills.
https://flyscrape.com
Mozilla Public License 2.0

failed to launch browser (Win 11 x64) #54

Closed dynabler closed 6 months ago

dynabler commented 6 months ago

Windows 11, x64.

I tried the browser option, and it downloads and unpacks just fine, but then I get the following error:

failed to launch browser: fork/exec C:\Users\HP\AppData\Local\Temp\leakless-0c3354cd58f0813bb5b34ddf3a7c16ed\leakless.exe: Operation did not complete successfully because the file contains a virus or potentially unwanted software.

Not sure how to solve this. It's the security thingie hell of Windows 11. I also wonder: most people already have a browser installed, so could the default installed browser be used instead?

Second, let's say you do want to run a (headless) browser, where should it be placed? Maybe it makes more sense to have something like this: flyscrape run script.js --browser: C:\browser\chro.exe

Using the default installed browser or specifying where a second one is located has benefits: browser scraping is utterly slow because it uses massive amounts of memory, so I personally have extensions installed to turn off images for example or turn off JavaScript.

If flyscrape used the default browser or allowed specifying a second Chromium instance, using extensions would be possible.
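As a rough illustration of the idea above, a tool could prefer an explicitly given browser path and otherwise fall back to a browser found on PATH. This is only a sketch: `resolve_browser`, the candidate executable names, and the lookup order are assumptions made for illustration, not something flyscrape implements.

```python
import os
import shutil
from typing import Iterable, Optional

# Hypothetical executable names to look for; not taken from flyscrape.
DEFAULT_CANDIDATES = ("chrome", "chromium", "chromium-browser", "msedge")


def resolve_browser(explicit_path: Optional[str] = None,
                    candidates: Iterable[str] = DEFAULT_CANDIDATES) -> Optional[str]:
    """Return a browser executable path, or None if nothing usable is found."""
    # 1. An explicitly given path wins, but only if the file really exists.
    if explicit_path:
        return explicit_path if os.path.isfile(explicit_path) else None
    # 2. Otherwise fall back to the first candidate found on PATH.
    for name in candidates:
        found = shutil.which(name)
        if found:
            return found
    return None
```

Returning `None` instead of raising leaves it to the caller to decide whether to download a bundled browser or report an error.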

philippta commented 6 months ago

Thanks for reporting this issue. I have been thinking about this for a long time, but fighting against anti-virus software is something I would like to stay away from.

If possible, try to mark the folder or leakless.exe as a trusted program in the anti-virus software.

Initially I was exploring the use of the existing browser, but let me tell you, this would create more issues that are even more difficult to understand. As an example, you would always have to make sure that your existing browser is fully closed. Otherwise flyscrape could not remote control it (it's a security feature). This also has the downside that you won't be able to use the browser while writing your scraping script, and going back and forth between flyscrape and your browser is terrible for the user experience.

This and a few other minor problems led me to the decision to make flyscrape always use its own, purpose-fit browser instead of the existing one.

Unfortunately I don't have a Windows 11 PC and can't help you troubleshoot this problem. But if you find a way to solve it, please share your findings here.
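To make the remote-control point above more concrete: Chromium-based browsers typically expose their DevTools remote-debugging endpoint only when they are started with `--remote-debugging-port`, which is why an already running browser cannot simply be taken over. The sketch below builds such a command line with a separate profile directory via `--user-data-dir`, so it would not collide with a personal browser session. The function name, the default port, and the profile path are assumptions for illustration; this is not how flyscrape launches its browser.

```python
from typing import List


def build_chrome_command(binary: str, profile_dir: str, port: int = 9222,
                         disable_images: bool = False) -> List[str]:
    """Build an argument list for a separately profiled Chromium instance."""
    cmd = [
        binary,
        f"--remote-debugging-port={port}",  # expose the DevTools endpoint
        f"--user-data-dir={profile_dir}",   # keep this instance apart from the everyday profile
        "--headless",                       # run without a visible window
    ]
    if disable_images:
        # Blink setting that stops images from loading, to save memory and bandwidth.
        cmd.append("--blink-settings=imagesEnabled=false")
    return cmd
```

The resulting list could be handed to `subprocess.Popen`, for example with the browser path mentioned earlier in this thread (`C:\browser\chro.exe`) and any empty directory as the profile.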

dynabler commented 6 months ago

The sentiment here is the same. Dealing with Windows security is a full-time job. I will try your suggestions when I get around to it and share if I get it to work. I think WSL, as mentioned in the installation docs, is the way to go.