taganaka closed 9 years ago
I think the overflow manager would be a perfect match for a plugin. According to the current architecture, it would start on on_crawl_start and finish (kill) on on_crawl_end.
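A minimal sketch of what such a plugin could look like. The class name OverflowManagerPlugin, its constructor arguments, and the hook signatures are assumptions made for illustration; only the hook names on_crawl_start and on_crawl_end come from the discussion, and this is not Polipus's actual plugin API.

```ruby
# Hypothetical plugin: the class name and hook signatures are NOT taken
# from Polipus; they only mirror the hook names mentioned in the thread.
class OverflowManagerPlugin
  attr_reader :thread

  def initialize(interval: 0.05, &work)
    @interval = interval
    @work = work
  end

  # Would be called once when the crawl session starts.
  def on_crawl_start
    @thread = Thread.new do
      loop do
        @work.call
        sleep @interval
      end
    end
  end

  # Would be called once when the crawl session ends.
  def on_crawl_end
    return unless @thread

    @thread.kill
    @thread.join # wait until the thread has really terminated
  end
end

ticks = 0
plugin = OverflowManagerPlugin.new { ticks += 1 }
plugin.on_crawl_start
sleep 0.2
plugin.on_crawl_end
puts "manager alive after crawl end: #{plugin.thread.alive?}"
```

The point of the sketch is that the plugin owns the thread, so the code that starts the manager is also the code that stops it.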
Second, shouldn't we call thread.exit as you already did in commit 68d00fafa81e9f2471dc2242925d52df2396c590?
I'm still not 100% convinced plugins are really needed. Here is an attempt to make the plugin architecture cleaner and more practical:
https://github.com/taganaka/polipus/compare/plugins
Exposing the current plugin hooks as public methods, as we do for the other DSL methods, might remove the need for plugins altogether.
At this point plugins are just simple classes where an instance of Polipus is passed to the initializer and then specific blocks of code are added to the exposed methods.
Thread.exit should not be needed here. The thread is not joined in the main thread, so when the main thread terminates, all of the other threads are killed as well.
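The behaviour described above can be checked with a small experiment. The sketch below runs a throwaway child script (invented here for illustration, not taken from Polipus) in a separate Ruby process: the child starts an endless background thread, never joins it, and the process still exits once its main thread is done.

```ruby
require 'open3'
require 'rbconfig'

# Illustrative child script: an endless background thread that is never
# joined, followed by a short-lived main thread.
child_script = <<~'RUBY'
  $stdout.sync = true
  Thread.new do
    loop do
      puts 'background thread working'
      sleep 0.05
    end
  end
  sleep 0.2
  puts 'main thread done'
RUBY

# Run the script with the same Ruby interpreter that runs this example.
output, status = Open3.capture2(RbConfig.ruby, '-e', child_script)

puts output
puts "child exited cleanly: #{status.success?}"
```

This only covers the case where the whole process ends. It says nothing about a long-lived process that calls the crawler several times.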
For me there is a strong use case for plugins, as the options list is too long in my opinion and the number of methods in PolipusCrawler is too high.
I would give it access to the instance of PolipusCrawler and the current instance of the Worker.
Could you open a [WIP] pull request for the plugins branch so that we can discuss it there?
I have to admit, I don't have much experience with threads in Ruby. But just because I quit PolipusCrawler#takeover does not mean I quit the main process. Even though #takeover is done with its job, the overflow manager would still be running, right?
So in some cases (e.g. rake tasks, pry/irb console, maybe tests), multiple threads with an overflow manager could still be running. These would be edge cases, of course.
    def do_your_job(name)
      # Runs forever: nothing ever stops this loop.
      loop do
        puts "#{name} is still working"
        sleep 1
      end
    end

    def takeover
      # The manager thread is started but never stopped or joined.
      Thread.new { do_your_job("Overflow Manager") }
      workers = 3.times.map do |worker_number|
        Thread.new do
          puts "Worker #{worker_number} starting crawl session..."
          sleep 3
          puts "Worker #{worker_number} finishing crawl session..."
        end
      end
      sleep 10
      puts '10 Seconds are over. Joining.'
      # Array#join would only build a string; join each thread instead.
      workers.each(&:join)
    end

    takeover
    takeover
    takeover
would result in
Overflow Manager is still working
Worker 0 starting crawl session...
Worker 1 starting crawl session...
Worker 2 starting crawl session...
Overflow Manager is still working
Overflow Manager is still working
Worker 1 finishing crawl session...
Worker 2 finishing crawl session...
Worker 0 finishing crawl session...
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
10 Seconds are over. Joining.
Worker 2 starting crawl session...
Overflow Manager is still working
Worker 0 starting crawl session...
Worker 1 starting crawl session...
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Worker 2 finishing crawl session...
Worker 0 finishing crawl session...
Worker 1 finishing crawl session...
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
10 Seconds are over. Joining.
Worker 2 starting crawl session...
Overflow Manager is still working
Worker 0 starting crawl session...Worker 1 starting crawl session...
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Worker 0 finishing crawl session...Worker 2 finishing crawl session...
Worker 1 finishing crawl session...
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
Overflow Manager is still working
10 Seconds are over. Joining.
Which is approx. 10 times "Overflow Manager is still working" for the first takeover, 20 times for the second takeover and 30 times for the third takeover, because each call leaves one more overflow manager thread running.
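One way to avoid the accumulating threads is to keep a reference to the manager thread and stop it when the takeover finishes. The following is only a sketch under that assumption, not the change made in the referenced commit. The log queue and the managers array are hypothetical helpers added so the result can be inspected, and the sleep times are shortened.

```ruby
def do_your_job(name, log)
  loop do
    log << "#{name} is still working"
    sleep 0.05
  end
end

# log and managers are illustrative parameters, not part of Polipus.
def takeover(log, managers)
  manager = Thread.new { do_your_job('Overflow Manager', log) }
  managers << manager

  workers = 3.times.map do |worker_number|
    Thread.new do
      sleep 0.1
      log << "Worker #{worker_number} finishing crawl session..."
    end
  end
  workers.each(&:join)
ensure
  # Stop the manager even if a worker raised, then wait for it to die.
  manager&.kill
  manager&.join
end

log = Queue.new # thread-safe collector for the messages
managers = []
3.times { takeover(log, managers) }

puts "managers started: #{managers.size}"
puts "managers still alive: #{managers.count(&:alive?)}"
```

With the ensure block in place, repeated calls in the same process (rake task, console, tests) should no longer leave manager threads behind.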
Coverage decreased (-0.5%) when pulling 12feef11f2c26f81be5fe0aaf42b967e4909f86b on overflow_items_controller_ref into 0e4b19adbce8d408994ce83b43ea6fe148821a55 on master.