Closed GoogleCodeExporter closed 9 years ago
I think we should just reschedule next_scan on hash change. It could be done in
the except block at spider.py, line 211.
The only problem is that this contradicts the next_scan displayed in the web
interface, so it should be documented on the FAQ page (Q: I have SMB and FTP
servers with the same content. Why does one of them sometimes get rescanned
before the time shown on the share page?
A: Shares with the same content are rescanned out of schedule if one of them has
changed).
Original comment by radist...@gmail.com
on 13 Apr 2010 at 9:45
>I think we should just reschedule next_scan on hash change.
Then write the exact SQL command. I've already tried, but the result was
unsatisfying. Keep in mind that trees can converge and diverge, and the scanning
time is unknown a priori.
Original comment by ruslan.savchenko
on 13 Apr 2010 at 10:03
"UPDATE shares SET next_scan=now() WHERE tree_id=%s AND size>0" % oldtree_id,
but this query requires an additional update of next_scan at spider.py, lines
127, 181 and 199. The latter could be done with an additional query "UPDATE
shares SET next_scan=now()+interval %(i)s WHERE share_id=%(s)s AND
next_scan<now()" to prevent next_scan from being rewritten twice, but I don't
think that setting next_scan after a successful scan (and optional update) is
wrong.
Original comment by radist...@gmail.com
on 13 Apr 2010 at 10:17
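For illustration, the two queries in the comment above can be sketched as follows. This is only a sketch under stated assumptions, not the code in spider.py: it uses an in-memory sqlite3 database in place of the project's real database, a hypothetical minimal `shares` table, integer timestamps instead of now(), and the hypothetical helper names `force_rescan` and `schedule_after_scan`. Values are passed as bound parameters rather than through `%` string formatting.

```python
import sqlite3


def force_rescan(conn, tree_id, now):
    """Make every non-empty share of the given tree due for scanning now."""
    conn.execute(
        "UPDATE shares SET next_scan = ? WHERE tree_id = ? AND size > 0",
        (now, tree_id),
    )


def schedule_after_scan(conn, share_id, now, interval):
    """Move next_scan forward only if it is already in the past."""
    conn.execute(
        "UPDATE shares SET next_scan = ? WHERE share_id = ? AND next_scan < ?",
        (now + interval, share_id, now),
    )


# Hypothetical schema and data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shares ("
    "share_id INTEGER PRIMARY KEY, tree_id INTEGER, "
    "size INTEGER, next_scan INTEGER)"
)
conn.executemany(
    "INSERT INTO shares VALUES (?, ?, ?, ?)",
    [
        (1, 10, 500, 2000),  # e.g. an smb share with tree 10
        (2, 10, 500, 3000),  # e.g. an ftp share with the same tree
        (3, 11, 700, 4000),  # unrelated tree
        (4, 10, 0, 5000),    # size 0, skipped by the size > 0 filter
    ],
)

force_rescan(conn, tree_id=10, now=1000)
schedule_after_scan(conn, share_id=1, now=1100, interval=600)  # overdue, rescheduled
schedule_after_scan(conn, share_id=3, now=1100, interval=600)  # not due, left alone

rows = dict(conn.execute("SELECT share_id, next_scan FROM shares").fetchall())
```

Here share 2 stays due immediately until its own scan finishes, while share 3 keeps its original schedule because of the `next_scan < ?` guard.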
> "UPDATE shares SET next_scan=now() WHERE tree_id=%s AND size>0" % oldtree_id
It doesn't work when 3 spiders are running at the same time, though.
Original comment by radist...@gmail.com
on 13 Apr 2010 at 10:21
Maybe we need some configurable time limit for low-level scanners?
Original comment by radist...@gmail.com
on 13 Apr 2010 at 10:23
Well, maybe you're right and all we need is a modified update query for the
existing tree case:
{{{
if size is not None:
    if size > 0:
        # Look for any share that already points at this tree.
        cursor.execute("SELECT next_scan FROM shares WHERE tree_id=%(t)s LIMIT 1",
                       {'t': tree_id})
    if size > 0 and cursor.rowcount > 0:
        # The tree already exists: reuse its next_scan for this share.
        cursor.execute("""
            UPDATE shares SET tree_id = %(t)s, size = %(sz)s, last_scan = now(),
                              next_scan = %(n)s
            WHERE share_id = %(s)s;
            """, {'s': share_id, 't': tree_id, 'sz': size,
                  'n': cursor.fetchone()[0]})
    else:
        # New or empty tree: leave next_scan untouched.
        cursor.execute("""
            UPDATE shares SET tree_id = %(t)s, size = %(sz)s, last_scan = now()
            WHERE share_id = %(s)s;
            """, {'s': share_id, 't': tree_id, 'sz': size})
}}}
Original comment by radist...@gmail.com
on 13 Apr 2010 at 10:55
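The idea in the snippet above (a share that lands on an already known tree inherits that tree's next_scan) can be tried in isolation with the sketch below. The assumptions are the same kind as before: sqlite3 stands in for the real database, the `shares` table is a hypothetical minimal version, timestamps are plain integers, and `save_scan_result` is a made-up helper name, not a function from spider.py.

```python
import sqlite3


def save_scan_result(conn, share_id, tree_id, size, now):
    """Store a scan result; return True if next_scan was inherited."""
    row = None
    if size > 0:
        # Any share already attached to this tree supplies the schedule.
        row = conn.execute(
            "SELECT next_scan FROM shares WHERE tree_id = ? LIMIT 1",
            (tree_id,),
        ).fetchone()
    if row is not None:
        conn.execute(
            "UPDATE shares SET tree_id = ?, size = ?, last_scan = ?, "
            "next_scan = ? WHERE share_id = ?",
            (tree_id, size, now, row[0], share_id),
        )
    else:
        # Unknown or empty tree: keep the share's own next_scan.
        conn.execute(
            "UPDATE shares SET tree_id = ?, size = ?, last_scan = ? "
            "WHERE share_id = ?",
            (tree_id, size, now, share_id),
        )
    return row is not None


# Hypothetical schema and data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shares ("
    "share_id INTEGER PRIMARY KEY, tree_id INTEGER, size INTEGER, "
    "last_scan INTEGER, next_scan INTEGER)"
)
conn.executemany(
    "INSERT INTO shares VALUES (?, ?, ?, ?, ?)",
    [(1, 10, 500, 900, 2000), (2, 20, 300, 800, 1500)],
)

# Share 2 now has the same content as share 1 (tree 10).
inherited = save_scan_result(conn, share_id=2, tree_id=10, size=500, now=1000)
# Share 1 changes to a tree nobody else has (tree 30).
fresh = save_scan_result(conn, share_id=1, tree_id=30, size=600, now=1100)

result = conn.execute(
    "SELECT share_id, tree_id, size, last_scan, next_scan "
    "FROM shares ORDER BY share_id"
).fetchall()
```

Like the original snippet, this does nothing to coordinate several spiders running at once; it only shows the single-process behaviour.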
>Maybe we need some configurable time limit for low-level scanners?
I don't like this idea.
> "UPDATE shares SET next_scan=now() WHERE tree_id=%s AND size>0" % oldtree_id,
There may be many shares with next_scan < now(). This is not a big deal and can
be fixed.
> It doesn't work when 3 spiders are running at the same time, though.
Maybe quantizing shares by tree_id in the main cycle would help us?
Original comment by ruslan.savchenko
on 13 Apr 2010 at 11:00
> Maybe quantizing shares by tree_id in the main cycle would help us?
Only one share is selected in each loop cycle so that several spider instances
can run at once. We shouldn't make assumptions about the states of other shares
during a scan. For example, lookup.py could delete some of them.
> There may be many shares with next_scan < now(). This is not a big deal and
can be fixed.
It's not a problem; typically all of them will be rescanned soon. Anyway, I
don't like that query anymore.
Original comment by radist...@gmail.com
on 13 Apr 2010 at 11:11
>We shouldn't make assumptions about the states of other shares during a scan.
If patching is implemented, the spider will have to deal with a pack of shares.
Otherwise, diverged shares would be patched from one to another without
diverging them in the database. Ignoring this problem won't do any good, because
this issue is a step toward patching.
Original comment by ruslan.savchenko
on 13 Apr 2010 at 11:59
Forget it. With patching we have even more trouble. I'll leave this issue open
because this really is a problem, but patching comes first.
Original comment by ruslan.savchenko
on 13 Apr 2010 at 12:09
close
Original comment by ruslan.savchenko
on 24 Apr 2010 at 12:16
Original issue reported on code.google.com by
ruslan.savchenko
on 13 Apr 2010 at 7:28