goodhosts / vagrant

Vagrant plugin that use goodhosts to manipulate hosts files
https://rubygems.org/gems/vagrant-goodhosts/
MIT License
17 stars 8 forks source link

Wait for popen3 execution of goodhosts #43

Open conradolandia opened 3 years ago

conradolandia commented 3 years ago

I block malware sites and ads from the internet using a large hosts file (600.000+ lines), redirecting all these bad hosts to 0.0.0.0. When I try to use goodhosts, it just freezes and whatever process I was on won't go on. If I remove most of the entries form /etc/hosts, the process completes without problems. Is this a bug or expected behavior?

Mte90 commented 3 years ago

I think that https://github.com/goodhosts/cli doesn't support a big hosts file at all. I used to have an hosts file huge like you but was like 100000~ lines and was working. I can't do anything for this as this is not a vagrant issue but the cli tool that the plugin uses.

conradolandia commented 3 years ago

Thank you, will take the problem there.

tomjn commented 3 years ago

@Mte90 how do you know it's goodhosts itself that's stalling?

Mte90 commented 3 years ago

So I had in the past a lot of hosts inside the file and goodhosts was parsing all of them for rewriting (clean as example) and this was changing the whole file and delay everything. As the plugin just call the command with the hosts we need to add the stall should be on the goodhosts side as need to parse 600000~ urls for cleaning and look if there are duplicates and so on.

At the end I removed everything on my hosts file and there are no issues anymore, also as goodhosts development is stalled I didn't opened a ticket for it.

tomjn commented 3 years ago

@Mte90 that's cleaning, we don't clean anymore on VVV where this was first seen, haven't for a long time.

But even ignoring that, you've shared an anecdote, no actual data has been shared proving it, just assumptions. Instead, @conradolandia should be able to test the theory by trying to add a host with the goodhosts CLI to see if it still hangs. If so, it's the CLI. If not, it's the vagrant plugin.

You're also assuming they have the latest version of the plugin, we don't know that

tomjn commented 3 years ago

@conradolandia grab the latest goodhosts from https://github.com/goodhosts/cli/releases and try to use it to add a host, e.g. goodhosts add 127.0.0.1 example.test

tomjn commented 3 years ago

and run vagrant plugin list --local to get the version of the plugin

conradolandia commented 3 years ago

Results:

Downloaded the goodhosts binary from https://github.com/goodhosts/cli/releases, where it reports the last version as 1.0.7. It adds the entry with no problems, even in the 600.000+ lines file, where it takes a couple seconds, but it makes the addition. Removing also works fine.

$ vagrant plugin list --local
vagrant-goodhosts (1.1.0, local)
Mte90 commented 3 years ago

How much times takes? maybe is the ruby api that has issues with the timeout or similar.

conradolandia commented 3 years ago
~ wc -l /etc/hosts
610748 /etc/hosts
~ time sudo goodhosts add 127.0.0.1 loquesea.test
hosts entry added: 127.0.0.1 loquesea.test
sudo goodhosts add 127.0.0.1 loquesea.test  3,61s user 0,37s system 156% cpu 2,545 total
Mte90 commented 3 years ago

So I guess that the issue is at https://github.com/goodhosts/vagrant/blob/master/lib/vagrant-goodhosts/GoodHosts.rb#L119 A solution could be execute the command in a different thread but this not ensure that it works always.

Looking at the doc a pid is created but we are no printing it, https://ruby-doc.org/stdlib-2.4.2/libdoc/open3/rdoc/Open3.html#method-c-popen3

We can add a code that wait for the execution of the process like this https://stackoverflow.com/questions/11710542/wait-for-popen3-process-to-finish but I am not sure.

tomjn commented 3 years ago

we should wait for the command to finish before proceeding otherwise how do we know the command is finished, and how do we avoid multiple attempts happening at the same time from the same vagrant command

YouJiacheng commented 1 year ago

Can we use backticks directly as suggested by following chart?

Mte90 commented 1 year ago

We need the output of the command including also stderr https://github.com/goodhosts/vagrant/blob/main/lib/vagrant-goodhosts/GoodHosts.rb#L138

YouJiacheng commented 1 year ago

So Open3.capture3 should be a drop-in replacement?

Mte90 commented 1 year ago

We already use popen3 https://github.com/goodhosts/vagrant/blob/main/lib/vagrant-goodhosts/GoodHosts.rb#L113 so I need to know what are the differences between the two methods.

tomjn commented 1 year ago

What’s the reason for the change? Calling the CLI works already so I don’t see any improvements to be had here and it sounds like it already uses the optimum method

On Wed, 7 Dec 2022 at 19:03, Daniele Scasciafratte @.***> wrote:

We already use popen3 https://github.com/goodhosts/vagrant/blob/main/lib/vagrant-goodhosts/GoodHosts.rb#L113 so I need to know what are the differences between the two methods.

— Reply to this email directly, view it on GitHub https://github.com/goodhosts/vagrant/issues/43#issuecomment-1341442875, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAOLZ5GZZNQ2JNXZBWJKVLWMDNQRANCNFSM5GAG5LCA . You are receiving this because you commented.Message ID: @.***>

YouJiacheng commented 1 year ago

capture3 is a light weight wrapper of block form popen, which waits for the process when it returns. Okay I found that wait_thr.value also waits for the termination of the process. So why this issue is not closed?

YouJiacheng commented 1 year ago

Okay, it might be the deadlock problem of popen3.

YouJiacheng commented 1 year ago

https://www.rubydoc.info/stdlib/open3/Open3.popen3

You should be careful to avoid deadlocks. Since pipes are fixed length buffers, Open3.popen3(“prog”) {|i, o, e, t| o.read } deadlocks if the program generates too much output on stderr. You should read stdout and stderr simultaneously (using threads or IO.select). However, if you don't need stderr output, you can use Open3.popen2. If merged stdout and stderr output is not a problem, you can use Open3.popen2e. If you really need stdout and stderr output as separate strings, you can consider Open3.capture3.

YouJiacheng commented 1 year ago

Not only the stderr, since we never read from stdout before join the waiting thread(wait_thr.value), too much output on stdout will cause deadlock as well. Even worse, we don't close stdin, stdout and stderr, which is required by docs: "stdin, stdout and stderr should be closed explicitly in this form."

stdin, stdout, stderr, wait_thr = Open3.popen3([env,] cmd... [, opts])
pid = wait_thr[:pid]  # pid of the started process
...
stdin.close  # stdin, stdout and stderr should be closed explicitly in this form.
stdout.close
stderr.close
exit_status = wait_thr.value  # Process::Status object returned.
YouJiacheng commented 1 year ago

According to "When I try to use goodhosts, it just freezes and whatever process I was on won't go on.", it seems a deadlock. However, the cli never output too much. Anyone can reproduce the issue?

YouJiacheng commented 1 year ago

Since capture3 is more robust than popen3, can we use it instead? Hope that can solve this issue.

Mte90 commented 1 year ago

I am not able to reproduce the issue but I think that to test it is required a big hosts file that take a lot to parse it.

tomjn commented 1 year ago

@Mte90 I'm not sure why a big hosts file would matter here as this code isn't inside goodhosts, it's the vagrant plugin that calls it, no host files are being touched by ruby itself

tomjn commented 1 year ago

Since capture3 is more robust than popen3, can we use it instead? Hope that can solve this issue.

can you make a PR?

Mte90 commented 1 year ago

The problem is the execution of the cli command that takes a while and probably this freeze something in vagrant. In our case the output is very tiny anyway from this so I think that it is something to be improved on the cli side and not on the vagrant side.

tomjn commented 1 year ago

If the user walks away and comes back to a sudo or UAC prompt 5 minutes later that’s no different to goodhosts taking 5 minutes to handle a massive hosts file so this has to be handled at this end too if it’s a problem

On Fri, 9 Dec 2022 at 14:02, Daniele Scasciafratte @.***> wrote:

The problem is the execution of the cli command that takes a while and probably this freeze something in vagrant. In our case the output is very tiny anyway from this so I think that it is something to be improved on the cli side and not on the vagrant side.

— Reply to this email directly, view it on GitHub https://github.com/goodhosts/vagrant/issues/43#issuecomment-1344344917, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAOLZ3R7LYVKMSHDGFF4N3WMM3XVANCNFSM5GAG5LCA . You are receiving this because you commented.Message ID: @.***>