Closed GoogleCodeExporter closed 9 years ago
There was a livelock in the tile loading.
Basically the file loader initialized network loader and called the callback
function, that initialized repaint and again started the file loader worker and
so on.
Made file loader worker not to call the callback function if the task is
deferred to
the network loader.
Also did some cleanup on the messy code in OpenStreetMapAsyncTileProvider.
Original comment by viesturz
on 6 Apr 2010 at 3:26
Attachments:
Note that #15 latest patch also contains this patch.
Original comment by viesturz
on 12 Apr 2010 at 7:27
I had a look at the patch and I have some questions/comments.
1. You changed the comment from "move it to the front of the queue" to "move it
to the BACK of the queue".
This is supposed to be a LRU queue, ie the most recently accessed should be the
first item off the queue. Do
you think it's not working that way?
2. In TileLoader.run you call finishTileLoad in the catch and the finally. Is
that intentional? It seems
more logical that you should call it in the success and catch cases, but not
the finally.
3. You call mPending.containsKey and then mPending.put. That seems to be
rather inefficient since the return
value of put tells you whether it previously existed. Perhaps that also makes
the synchronized block
redundant.
4. The return value of loadMapTileAsync is not used.
5. Why did you change the value of mPending from Object to
IOpenStreetMapTileProviderCallback? The value is
not used.
6. Using mWorking probably makes the code less messy because we don't need to
bother about
ConcurrentModificationException, but doesn't it also make it less efficient?
These are just my immediate comments. I haven't looked at your solution in
detail. So to summarise, I can
believe that there might be a livelock because everything is trying to process
the most recently accessed tile
first, but perhaps there's a simpler solution - perhaps just a one-liner to
"make file loader worker not call
the callback function if the task is deferred to the network loader".
Regards
Neil
Original comment by neilboyd
on 13 Apr 2010 at 10:57
1. I read the documentation and actually tested the LinkedHashMap, it stores
elements
in "access order", the more recently accessed elements are last in the order
(the
ones we want to download first).
2. That is not intentional, haven't changed that code.
3. the return value of put returns the previous value associated with that key.
Null
value is a valid value to be associated (caller passes null as the callback ).
The
only way to be sure is check containsKey.
4. yes, I had an idea to use it but apparently it is not valid anymore. Can
safely
remove the return value.
5. I was planning to remove the callback member from the TileLoader calss, as
it is
passed around uselessly in all the constructors and store in the map instead.
But it
did not work out as the FilesystemProvider actually needed it. Can be safely
reverted
to the old style (also eleminiates point 3).
6. don't know how fast the android locking code is, but
ConcurrentModificationException is just a way to show developers there are
errors in
the code, not a way to test for modifications. Also the speed is not relevant
here.
Remote service calling overhead is surely WAY more than this.
I can make a 3 lines patch for this, but I wanted to make the code a bit more
readable.
The original code had all kinds of possible concurrency problems apart from the
apparent livelock. Each tile could be actually downloaded twice, because would
be
reinserted in the download queue while the first download is in progress.
I will sumbit a new patch addressing most of the mentioned problems.
Original comment by viesturz
on 13 Apr 2010 at 11:26
Improved patch.
Original comment by viesturz
on 13 Apr 2010 at 11:50
Attachments:
Yes you're right, the iterator returns the least recently accessed items first.
But
that was not my intention, so I'll fix it so that gets the most recently
accessed
first.
And you're also right about needing the mWorking list to prevent items in
progress
being re-added to the list.
Surprising that my original version worked at all ;-)
Original comment by neilboyd
on 13 Apr 2010 at 3:22
Revision 127 contains a fix based on this patch.
Original comment by neilboyd
on 13 Apr 2010 at 9:00
Comments about the rev 127:
0) This still does not address the initial livelock issue.
1) Constructing new stack in nextTile method is not nice. Makes lots of
allocations
for garbage collector. At least make the initial size as big as the mPending.
2) in TileLoader.run the finishTileLoad is not called if there is an error (line
#143). As result it is never removed from the mPending and mWorking maps.
Original comment by viesturz
on 14 Apr 2010 at 9:18
3) No synchornization. Ons sync block does no good, if all the other access
places
are not synced.
Original comment by viesturz
on 14 Apr 2010 at 9:31
0) Doesn't it? What else is needed? mWorking should be helping.
1) I know. I would like to extend LinkedHashMap and add a reverse iterator, but
all the useful
stuff is private. Anyone got a better idea? Stack only has a default
constructor.
2) I'm working on a new version which I'll check in later. I also forgot to not
redraw when the fs
delegates to the downloader. Instead of using the dummy return value you had,
I've added another
callback for the internal completion.
Original comment by neilboyd
on 14 Apr 2010 at 10:00
3) I didn't think the synchronisation around the clear/remove of
mPending/mWorking was
necessary because so long as they're done in that order it won't go wrong. It
doesn't
matter if an item is in mWorking when it's already been removed from mPending.
Original comment by neilboyd
on 14 Apr 2010 at 10:03
The map methods are NOT atomar operations, I bet there are at least 3 places in
each
command that can go wrong if not propery synced. I took me about 1 min of
browsing
around the map in emulator to get a force close. Added proper syncs and no more
problems.
You have misunderstood the usage of mPending (maybe should use a better name
for it)
- it contains BOTH the pending and working tiles.
Original comment by viesturz
on 14 Apr 2010 at 10:55
0) The real livelock fix is in the few changed lines in
OpenStreetMapTileFilesystemProvider and the corresponding processing in
finishTileLoad (that you have so nicely removed).
1) you cold use my original implementation that just iterates all the items in
map
and takes the last one.
2) yes, that would fix the livelock.
Original comment by viesturz
on 14 Apr 2010 at 11:14
Revision 133 contains some more changes.
Original comment by neilboyd
on 15 Apr 2010 at 9:04
It gets better ;)
One more note - you should also clear mWorking in
OpenStreetMapAsyncTileProvider line 45.
Original comment by viesturz
on 15 Apr 2010 at 9:57
Also how about fixing bug #15?
Original comment by viesturz
on 15 Apr 2010 at 10:20
I'm looking at it now. It's a bit difficult because the patch replaces all
lines so
it's hard to see the differences. That's because you've replaced LF with CRLF.
Original comment by neilboyd
on 15 Apr 2010 at 10:26
Are we finished with this now?
Original comment by neilboyd
on 15 Apr 2010 at 8:00
I will test it. But current trunk does not compile :( (rev 138).
Original comment by viesturz
on 16 Apr 2010 at 6:41
Sorry, it compiles OK, Eclipse had not picked up all the changes.
Original comment by viesturz
on 16 Apr 2010 at 6:42
Original comment by neilboyd
on 10 Sep 2010 at 2:34
Original issue reported on code.google.com by
viesturz
on 6 Apr 2010 at 3:11