SyneRBI / SIRF

Main repository for the CCP SynerBI software
http://www.ccpsynerbi.ac.uk
Other
58 stars 29 forks source link

Compatibility between SIRF's and CIL's DataContainer #798

Closed ClaireDelplancke closed 3 years ago

ClaireDelplancke commented 3 years ago

I would like to multiply an object of type sirf.STIR.ImageData by an object of type ccpi.framework.BlockDataContainer.BlockDataContainer where each DataContainer inside BlockDataContainer are again of type sirf.STIR.ImageData. This should give me a BlockDataContainer. Instead, I've got the following error

AssertionError                            Traceback (most recent call last)
<ipython-input-6-215be633e2e3> in <module>()
----> 1 image.multiply(imagex3)

/home/sirfuser/devel/install/python/sirf/SIRF.pyc in multiply(self, other, out)
    132             other = self.copy()
    133             other.fill(tmp)
--> 134         assert_validities(self, other)
    135         if out is None:
    136             out = self.same_object()

/home/sirfuser/devel/install/python/sirf/Utilities.pyc in assert_validities(x, y)
    394     except AssertionError as ae:
    395         raise AssertionError('Expecting same type input, got {} and {}'.format(type(x), 
--> 396                                                                                type(y)))
    397     if x.handle is None:
    398         raise AssertionError('handle for first parameter is None')

AssertionError: Expecting same type input, got <class 'sirf.STIR.ImageData'> and <class 'ccpi.framework.BlockDataContainer.BlockDataContainer'>

As per @paskino this comes from the fact that DataContainer in CIL has got a __container_priority__ member but DataContainer in SIRF doesn't.

evgueni-ovtchinnikov commented 3 years ago

the error message says it all: you cannot multiply objects of different type

if ccpi.framework.BlockDataContainer.BlockDataContainer contains sirf.STIR.ImageData objects, then you need to multiply each of them by your sirf.STIR.ImageData object in a loop

ClaireDelplancke commented 3 years ago

Sure, but I would like to be able to do it! For example I need it in order to use MixedL21Norm implemented in CIL.

evgueni-ovtchinnikov commented 3 years ago

I am afraid I may not be able to help with that, better ask CCPi people to implement such multiplication for ccpi.framework.BlockDataContainer.BlockDataContainer

paskino commented 3 years ago

The problem is that both sirf.STIR.ImageData (or better sirf.DataContainer and ccpi.framework.BlockDataContainer have a multiply (and all other algebra methods).

This means that in this situation

sid1 = sirf.STIR.ImageData()
sid2 = sirf.STIR.ImageData()
sid3 = sirf.STIR.ImageData()

bdc = BlockDataContainer(sid1, sid2)

out = sid3 *  bdc

Python will use multiply from the sirf.STIR.ImageData which doesn't know anything about BlockDataContainer, resulting in your error.

On the other hand BlockDataContainer will defer the actual multiply to sid1, sid2, sid3 in a loop, exactly as @evgueni-ovtchinnikov suggests. So one needs to tell Python to use BlockDataContainer's multiply instead. This is done by the member __container_priority__ that is in BlockDataContainer and in ccpi.framework.DataContainer. Python will choose the method based on the highest value of the __container_priority__ member.

So, adding the same __container_priority__ = 1 in sirf.DataContainer should fix this issue.

I don't see any side effect in SIRF.

evgueni-ovtchinnikov commented 3 years ago

sorry, I do not like this at all, DataContainer is an abstract class, it should not depend on the usage context (lest we end up covering more and more contexts)

better use hasattr in your code to find out whether __container_priority__ is present

KrisThielemans commented 3 years ago

Relevant CIL lines are at https://github.com/vais-ral/CCPi-Framework/blob/2041fdc9651e1ae3e26cc50e46d47669dc918c1e/Wrappers/Python/ccpi/framework/framework.py#L856-L859

can't say I understand it...

ClaireDelplancke commented 3 years ago

me neither...

paskino commented 3 years ago

Thanks @KrisThielemans !!!

The idea is that DataContainer asked to do algebra with something else will first check if this something else has __container_priority__ and then compare it with it's own. It will then instruct Python to use the method from the class that has highest __container_priority__.

This is to prefer BlockDataContainer algebra wrt DataContainer.

It seems that my suggestion isn't sufficient and will require that the same if statement pointed by @KrisThielemans to be called in each algebra method of SIRF DataContainer.

The alternative is to return NotImplemented if the object in question cannot do algebra with BlockDataContainer instead of raising an error. In this way Python, before crashing will attempt to use the other viable method (the one in BlockDataContainer.


class A(object):
    def __mul__(self, other):
        if not isinstance(other, A):
            return NotImplemented
        print ("this is A.__mul__")

class B(object):
    def __rmul__(self, other):
        print ("this is B.__rmul__")

if __name__ == '__main__':
    a = A()
    b = B()
    a * b

prints this is B.__rmul__ I suppose the latter could be implemented, but not in the short term.

KrisThielemans commented 3 years ago

I agree with @evgueni-ovtchinnikov that putting in all these if statements in SIRF is a bit weird. However, it is (strongly) motivated by our integration with CIL. It is a growing problem of course. We cannot really "expect" CIL to put specific SIRF work-arounds in either...

After reading https://stackoverflow.com/questions/5181320/under-what-circumstances-are-rmul-called, I think that __mul__ should not raise an error if the 2nd arg isn't supported, but return NotImplemented. It's arguably a bug otherwise.

Of course, this doesn't resolve the problem, until CIL.BlockDataContainer has a __rmul__. I think this is a much cleaner solution though then the priority work-around.

Possibly letting CIL.BlockDataContainer.__rmul__ just do the same as __mul__ might do the trick, but I'll leave that for someone else to try...

Should CIL's binary_operation not return NotImplemented as well?

Unless someone can try the rmul stuff in CIL soon, I think we would need to put in the (somewhat horrible) if/container_priority stuff in now anyway.

(After reading-up on Python and overloading, I've started liking it a little bit less... Although there's now @overload)

paskino commented 3 years ago

I originally used the __container_priority__ class member mimicking NumPy's __array_priority__.

Should CIL's binary_operation not return NotImplemented as well?

I think so too, https://github.com/vais-ral/CCPi-Framework/issues/645

I realised yesterday that raising NotImplemented is a cleaner solotion. However, this means that we need to update all of CIL and we can't do it right now. I tried to argue as @KrisThielemans that temporarily using the __container_priority__ class member in SIRF is the preferred workaround atm. Though it's a bit verbose, I don't think it'll change SIRF's functionality.

Possibly letting CIL.BlockDataContainer.__rmul__ just do the same as __mul__ might do the trick, but I'll leave that for someone else to try...

BlockDataContainer.__rmul__ does not do anything special, for each element in the block it just calls its __mul__ with the object you want to multiply it with. It's a very inclusive method that will just make use of each object's ability to perform the multiplication, and that's why it (should) works with SIRF and CIL DataContainer indipendently.

KrisThielemans commented 3 years ago

@paskino I was wondering if do you have an rmul already, maybe we don't need a change at all in CIL. Worth a try?

paskino commented 3 years ago

Yes, I was just thinking that if SIRF will return NotImplemented when trying to do algebra with objects that aren't supported by its own classes, it should be sufficient. CIL will still be using its __container_priority__ trick and will anyway defer to BlockDataContainer the multiplication (or any other algebra).


class A(object):
    __container_priority__ = 1
    def __mul__(self, other):
        if hasattr(other, '__container_priority__') and \
           self.__class__.__container_priority__ < other.__class__.__container_priority__:
            print ("hasattr")
            return other.__rmul__(self)
        print ("this is A.__mul__")

class B(object):
    __container_priority__ = 10
    def __rmul__(self, other):
        print ("this is B.__rmul__")

class C(object):
    def __mul__(self, other):
        if not isinstance(other, C):
            return NotImplemented

        print ("this is C.__mul__")

if __name__ == '__main__':
    a = A()
    b = B()
    a * b
    c = C()
    c * b

outputs

hasattr
this is B.__rmul__
this is B.__rmul__
KrisThielemans commented 3 years ago

excellent. @evgueni-ovtchinnikov can you therefore create a SIRF PR return NotImplemented as suggested above as opposed to throwing an error?

evgueni-ovtchinnikov commented 3 years ago

@KrisThielemans: in __mul__, __rmul__ and __div_ or __rmul__ only?

KrisThielemans commented 3 years ago

My feeling is that this needs to happen for all binary operations. See e.g. https://stackoverflow.com/questions/51298551/python-operator-overloading-radd-and-add

paskino commented 3 years ago

It needs to be implemented for all binary operations.

DANAJK commented 3 years ago

Overloading can hide complexity that is not interesting to the user, but it can also make the code path very difficult to follow. If you think that the user, or future debugger, can only understand what is happening if they know the detailed subtleties of how Python decides what to do, then I suggest you do not overload.

One other comment, can you make it such that if an overload is "detected" the actual functionality is performed in a method with a clear name e.g. SirfCilMul. That way, programmers can also call that method directly without all the nested thought processes needed for overloading?

paskino commented 3 years ago

Just for completeness:

class C(object):
    def __mul__(self, other):
        if not isinstance(other, C):
            print ("not compatible")
            return NotImplemented

        print ("this is C.__mul__")
class D(object):
    def __mul__(self, other):
        if not isinstance(other, C):
            print ("not compatible")
            return NotImplemented

        print ("this is D.__mul__")

if __name__ == '__main__':
    c = C()
    d = D()
    c * d

returns

c * d
TypeError: unsupported operand type(s) for *: 'C' and 'D'

We have never raised TypeError, but only NotImplemented. Python tried to find an appropriate method to perform the * and it failed: both C and D have a __mul__ method but no __rmul__. This means that when invoking C.__mul__ and raising NotImplemented Python went to see if D.__rmul__ was available, and failing it returned TypeError. which is nice as we now know that there's not * properly defined for the objects.

@DANAJK , I don't understand your point. You seem to imply we shouldn't use object oriented programming?

Also the point of all this is not to create a specific method to multiply SIRF with CIL objects, but let Python figure it out othewise we can't use any CIL algorithm. A CIL BlockDataContainer (BDC) is something more that a simple list of containers. First of all it can contain any container (like SIRF's) and it will allow you to multiply a BDC with something else by deferring the multiplication (all algebra really) to the objects it contains. So in the end it will call SIRF or CIL algebra methods depending on what it contains.

KrisThielemans commented 3 years ago

still need to fix addition and other operations, but we've merged the multiplication and division already. This should prevent problems with CIL current algorithms.

KrisThielemans commented 3 years ago

probably needs more unit tests but functionality should be there