Closed GoogleCodeExporter closed 8 years ago
I believe this is being triggered by the cache purge (which is done as part of
the autosave processing). Autosave runs every 5 minutes by default, and main memory is
purged of any projects which haven't been modified in the last hour. The
modification time isn't updated until the operation completes and is posted to
the history, so any long-running operation has the potential to have its
Project object (which it's in the process of using) deleted from memory in the
middle of the operation.
I've applied a fix in r2222 which I think should cure the problem (not only for
this, but for all long-running operations). Please give it a try and let me know
how you make out.
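
For anyone reading along, here is a minimal sketch of the kind of guard described above. This is not the actual r2222 change; the `Project` class, its fields, and `purgeStale` are hypothetical names made up for illustration. It only shows the idea: a project that is idle by modification time is still kept in memory if it has a process running.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class PurgeSketch {
    // Hypothetical stand-in for an in-memory project (not the real Refine class).
    static class Project {
        long lastModifiedMillis;   // only updated once an operation completes
        boolean hasRunningProcess; // true while a long-running operation is active

        Project(long lastModifiedMillis, boolean hasRunningProcess) {
            this.lastModifiedMillis = lastModifiedMillis;
            this.hasRunningProcess = hasRunningProcess;
        }
    }

    // Projects idle longer than this are candidates for purging (one hour).
    static final long MAX_IDLE_MILLIS = 60 * 60 * 1000L;

    // Remove idle projects from the cache, but never one that is still in use.
    static void purgeStale(Map<Long, Project> cache, long nowMillis) {
        Iterator<Map.Entry<Long, Project>> it = cache.entrySet().iterator();
        while (it.hasNext()) {
            Project p = it.next().getValue();
            boolean idle = nowMillis - p.lastModifiedMillis > MAX_IDLE_MILLIS;
            if (idle && !p.hasRunningProcess) {
                it.remove(); // safe removal while iterating
            }
        }
    }

    public static void main(String[] args) {
        Map<Long, Project> cache = new HashMap<>();
        cache.put(1L, new Project(0L, true));  // idle by timestamp, but fetching
        cache.put(2L, new Project(0L, false)); // idle and unused
        purgeStale(cache, 2 * MAX_IDLE_MILLIS);
        System.out.println(cache.keySet());
    }
}
```

Without the `hasRunningProcess` check, project 1 in the example would be dropped from memory mid-operation, which is the failure mode described above.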
Original comment by tfmorris
on 28 Aug 2011 at 10:02
Original comment by tfmorris
on 28 Aug 2011 at 10:10
So, does this turn off ANY writes to disk until a Fetch URLs operation
completes? If not, what types of writes to disk would still happen, even
after this patch (logging, anything else)? Asking so I can test further
later on.
Original comment by thadguidry
on 29 Aug 2011 at 1:18
Pulled new code via SVN, cleaned, built, and ran a Fetch URLs operation on 2400
MySpace URLs, which ran for about 1.5 hours (unattended). I watched it up
to 90% and walked away. When I returned to look at the Ubuntu terminal (20 mins
later), I saw the Fetch1 column created, but no data for any rows. I quickly
captured the log (attached here). I then did a Shift+F5 refresh, and no data
showed in the column. I stopped Refine, restarted it with the same parameters, and
opened Firefox again; still no data showed in Fetch1, and an isBlanks facet
showed True on all 2400 rows. (sigh) Back to the drawing board?
Original comment by thadguidry
on 29 Aug 2011 at 5:17
I could have sworn I replied to comment 3, but it looks like it never posted.
Writes aren't affected at all by the change; the only thing that has changed is
that projects with long-running processes aren't purged from memory.
Having said that, I don't think the project is marked as modified until the
operation completes, and it won't get autosaved unless it's modified.
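
To make the consequence concrete, here is a rough sketch of an autosave pass that only writes projects modified since their last save. The class, field, and method names are hypothetical and not taken from the Refine code; the point is just that a project with a fetch still in progress has an unchanged modification time, so the pass skips it.

```java
import java.util.ArrayList;
import java.util.List;

public class AutosaveSketch {
    // Hypothetical stand-in for a project; not the real Refine class.
    static class Project {
        final String name;
        long lastModifiedMillis; // bumped only when an operation completes
        long lastSavedMillis;

        Project(String name, long lastModifiedMillis, long lastSavedMillis) {
            this.name = name;
            this.lastModifiedMillis = lastModifiedMillis;
            this.lastSavedMillis = lastSavedMillis;
        }
    }

    // Returns the names of the projects this pass would write to disk.
    static List<String> autosave(List<Project> projects, long nowMillis) {
        List<String> saved = new ArrayList<>();
        for (Project p : projects) {
            // Skip anything not modified since its last save, including a
            // project whose long-running operation has not finished yet.
            if (p.lastModifiedMillis > p.lastSavedMillis) {
                p.lastSavedMillis = nowMillis; // a real save would write to disk here
                saved.add(p.name);
            }
        }
        return saved;
    }

    public static void main(String[] args) {
        List<Project> projects = new ArrayList<>();
        projects.add(new Project("edited", 100L, 50L));        // modified after last save
        projects.add(new Project("fetch-running", 10L, 50L));  // nothing posted yet
        System.out.println(autosave(projects, 200L));
    }
}
```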
Re: comment 4, were there any errors in the UI? You mentioned something about
using facets before. Was this test done with a facet or facets active? If so,
can you try a smaller test with and without facets to see if that changes the
behavior? I'll look to see if there is a path where the operation can
terminate without making the error visible.
Original comment by tfmorris
on 30 Aug 2011 at 8:02
No facets used on the Comment 4 testing. I am strictly testing the Fetch
operation to reach a 100% completion and also see the "CellAtRow
fetch(CellAtRow urlData)" at line 253 of
ColumnAdditionByFetchingURLsOperation.java actually create "new Cell" in the
column. It definitely creates the column on 100% completion... however it
doesn't appear to be inserting the stream into each new Cell...for some weird
reason. That is the general failure that I am seeing.
I would categorize this as 2 issues, I think?
1. Fetch doesn't always reach 100% completion, no error is shown, and no new
column of data is created.
2. Fetch DOES reach 100% completion and DOES create a new column, however no
stream data appears in the column's new cells (such as raw HTML).
Original comment by thadguidry
on 30 Aug 2011 at 8:25
I have tried 2 more rounds of testing with Fetch URLs.
1. Fetch 2000 rows of unique URIs from "test A" domain. (about 2 hours)
2. Fetch 4500 rows of unique URIs from "test A" domain. (about 3 hours)
Both tests completed and full HTML content was added into a new column.
Both tests used 2500ms delay. (quite harsh 2.5 sec expectation on returning a
page result)
Both tests used multiple facets to filter down to those particular rows.
My conclusion is that this now seems fixed, and that my previous concern seems
isolated to some sort of MySpace problem, or perhaps to my being too aggressive with
the delay during testing against the MySpace domain. The different "test A"
domain is also well known, and exhibited no such problems at all using
current Trunk r2234 (after Tom and David's patches).
We may want to leave open until after the 2.5 public beta testing. And if no
further issues, then close.
Original comment by thadguidry
on 7 Sep 2011 at 6:35
Based on Thad's additional testing and my analysis of the problem, I'm pretty
convinced this is fixed.
Original comment by tfmorris
on 22 Sep 2011 at 10:34
Original comment by dfhu...@google.com
on 9 Oct 2011 at 5:22
Original issue reported on code.google.com by
thadguidry
on 25 Aug 2011 at 5:03