radical-cybertools / radical.saga

A Light-Weight Access Layer for Distributed Computing Infrastructure and Reference Implementation of the SAGA Python Language Bindings.
http://radical-cybertools.github.io/saga-python/
Other
83 stars 34 forks source link

"fix" container add #805

Closed andre-merzky closed 4 years ago

andre-merzky commented 4 years ago

At some point, task membership in a list seems to have broken - I have no idea how and why. Note that this also holds for a plain Python list (so, when disabling the TASK attribute and adding self.tasks = list() in __init__. the tasks do have different object IDs. No idea how in is implemented (and don't care right now)

mtitov commented 4 years ago

for the method add (L346) I have the following proposal:

task_in_tasks = next(filter(lambda t: t._id == task._id, self.tasks), None)
if task_in_tasks is None:
    self.tasks.append(task)

later there is a similar approach in the method remove (L357)

task_in_tasks = next(filter(lambda t: t._id == task._id, self.tasks), None)
if task_in_tasks is not None:
    self.tasks.remove(task_in_tasks)  # ! should be `remove` not `delete`

BTW later there is a reference to task's id (L585), I think it is a mistake, there should be t._id not t.id, and the similar approach described above could be applied:

task_in_tasks = next(filter(lambda t: t._id == id, self.tasks), None)
if task_in_tasks is not None:
    return task_in_tasks
andre-merzky commented 4 years ago

Sure, we can work around this - but I would like to rather understand why the hell this doesn't work anymore. It for sure did work in the past, but I wasn't able to bisect yet (gave up because I was simply too frustrated).

For the t.id: good catch! This will only work if the task is a job (rs.job.Job inherits from rs.task.Task, and it has an id attribute, the job id). I'll add an id to the task (which is useful either way)

mtitov commented 4 years ago

we still can keep a note for that, just I don't like idea to add every task, since don't know how duplicates can mess the workflow (+ remove method also should be update) p.s. if to add every task, will that remove method delete all that copies?

mtitov commented 4 years ago

May be keep ID at Job level? (it is not that needed at Task level)

andre-merzky commented 4 years ago

Compare

#!/usr/bin/env python3

o1 = object()
o2 = object()
l  = list()

if o1 not in l: l.append(o1)
if o2 not in l: l.append(o2)

assert(len(l) == 2)

with

#!/usr/bin/env python3

d1 = dict()
d2 = dict()
l  = list()

if d1 not in l: l.append(d1)
if d2 not in l: l.append(d2)

assert(len(l) == 2)

edit: removed flaming

So, list membership is determined based on value for some builtin types, such as dict and lists, and unfortunately also for all objects which, directly or indirectly, inherit from the base type (rs.Task -> rs.Attributes -> ru.DictMixin -> dict). I vote for simply documenting that tasks can be added twice, and it is the user's responsibility to keep track of tasks added.

mtitov commented 4 years ago

As I see it now: before provided solution worked fine, since class rs.Task was object-origined (wasn't inherited from builtin classes), but it changed with the latest updates: class DictMixin(dict) (as an example with object as it was before, and as with dict - now). In this case we can add the following method to the class rs.Task:

def __eq__(self, task):
   return self._id == task._id

and no need to change methods add and remove, but method get_task has to be updated (at least with if t._id == id:, but my proposal might be faster ;) very subjective point of view)

andre-merzky commented 4 years ago

Yeah, that looks good indeed - nice! :-)

andre-merzky commented 4 years ago

@mtitov : I commit a version close to your proposal - can you have a look, please? Thanks!

mtitov commented 4 years ago

added couple fixes (id-related), it looks good to me now

andre-merzky commented 4 years ago

Thanks! :-)