pombreda / pysal

Automatically exported from code.google.com/p/pysal
Other
0 stars 0 forks source link

multiprocessing's map method triggers weird behaviour on W #237

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Last line in the following code causes the whole W object w to be printed:

>>> import pysal as ps
>>> import copy
>>>
>>> w = ps.lat2W()
>>> copy.deepcopy(w)

Charlie's pin pointed it to common.py, line 26.

Original issue reported on code.google.com by dreamessence on 17 Jan 2013 at 10:37

GoogleCodeExporter commented 9 years ago
Dani, thanks for finding the source of this issue.  I had this problem last 
week when using mpi4py on a cluster computer.  When I passed a W object into my 
simulator function, the whole W would print.  I "solved" the problem by 
building the W inside the function even though each simulation used the same W. 
 Commenting out line 26 would have been a better solution.

Building on Dani's example, this is what is returned when trying to 
(incorrectly) add something to the W.
>>> import pysal as ps
>>>
>>> w = ps.lat2W()
>>> w.neighbors[99] = [1,2,3]
(99, [1, 2, 3])

Maybe line 25 (in common.py) should be uncommented to give an explicit warning 
instead of the more cryptic information from line 26?  Can we give a warning to 
users when they try to edit a W object in this fashion, but do nothing when 
copy and mpi4py type libraries interact with the W?  I changed this ticket to 
"review" since the code is acting as designed.

Original comment by dfo...@gmail.com on 17 Jan 2013 at 11:39

GoogleCodeExporter commented 9 years ago
According to svn blame, Dani's response for the print statement in revision 
1086.
I believe this was changed because multiprocessing tries to make a copy of the 
ROD which for some reason calls __setitem__.

We need to implement better copy support for ROD, or drop it.

Original comment by schmi...@gmail.com on 17 Jan 2013 at 11:45

GoogleCodeExporter commented 9 years ago
I would vote for better copy support rather than dropping ROD. As Dave points 
out his use case isn't really "correct" use of W since by definition we did not 
want to allow changes in the neighbor or weight dicts after instantiation. The 
logic is if you change say w.neighbors[99] = [1,2,3] there would have to be a 
lot of overloading for the setters since now w.neighbors[1], w.neighbors[2], 
w.neighbors[3] all have to be changed otherwise there are going to be missing 
neighbor relations.

now if we rewrote the __copy__ method of ROD to be:

  return self.copy()

instead of

  return ROD(self.copy())

I'm wondering if that would fix the issues raised by multiprocessing?

Original comment by sjsrey on 18 Jan 2013 at 12:10

GoogleCodeExporter commented 9 years ago
For the record, in the multiprocessing case, I fixed it by making the W global, 
since I was only going to access it in each thread but not to modify it.

Regarding the svn blame, I think I changed the traceback to the print out in 
r1086 for my own use and must have committed it without realizing (the log 
message of the commit does not refer to it indeed). Ahh, in git this wouldn't 
have happended :-P

Original comment by dreamessence on 18 Jan 2013 at 11:20

GoogleCodeExporter commented 9 years ago
Ok, I committed a fix in r1418.  Take a look at the subtle difference between 
copy and deepcopy in the doctests.

Original comment by schmi...@gmail.com on 18 Jan 2013 at 2:10