I2PC / scipion

Scipion is an image processing framework to obtain 3D models of macromolecular complexes using Electron Microscopy (3DEM)
http://scipion.i2pc.es

Recode time-consuming loop in file protocol_particles.py #1857

Open rmarabini opened 5 years ago

rmarabini commented 5 years ago

In file protocol_particles.py the loop:

        for micKey, mic in micDict.iteritems():
            if counter % 50 == 0:
                b = datetime.now()
                print b-a, 'reading coordinates for mic number', "%06d" % counter
                sys.stdout.flush()  # force buffer to print
            counter += 1

            micId = mic.getObjId()
            coordList = []
            self.debug("Loading coords for mic: %s (%s)" % (micId, micKey))
            for coord in coordSet.iterItems(where='_micId=%s' % micId):
                # TODO: Check performance penalty of using this clone
                coordList.append(coord.clone())

is very inefficient and should be rewritten as:

         for coord in coordSet.iterItems(orderBy='_micId',
                                         direction='ASC'):
             micId = coord.getMicId()
             if micId != lastMicId:
                 lastMicId = micId
                 ..
             ..

If no indexes are used, the two ways of writing the code differ in run time by two orders of magnitude.
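To make the two access patterns concrete, here is a minimal sketch that uses Python's standard `sqlite3` module rather than Scipion's own set classes. The `coords` table, its columns, and the function names are all hypothetical and only stand in for `coordSet` and `micDict`. It shows that a single query ordered by micrograph id yields the same grouping as one filtered query per micrograph. It does not measure timing.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter


def build_db(n_mics=5, coords_per_mic=3):
    """Create an in-memory table of coordinates (hypothetical schema, no index on mic_id)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE coords "
        "(id INTEGER PRIMARY KEY, mic_id INTEGER, x INTEGER, y INTEGER)"
    )
    # Insert rows interleaved across micrographs so that ordering matters.
    rows = [(mic, 10 * i, 20 * i)
            for i in range(coords_per_mic)
            for mic in range(1, n_mics + 1)]
    conn.executemany("INSERT INTO coords (mic_id, x, y) VALUES (?, ?, ?)", rows)
    return conn


def coords_per_mic_query(conn, mic_ids):
    """One filtered query per micrograph: without an index, each query scans the table."""
    result = {}
    for mic_id in mic_ids:
        cur = conn.execute(
            "SELECT x, y FROM coords WHERE mic_id = ? ORDER BY id", (mic_id,)
        )
        result[mic_id] = cur.fetchall()
    return result


def coords_single_pass(conn):
    """One ordered query: a change in mic_id marks the start of the next group."""
    cur = conn.execute(
        "SELECT mic_id, x, y FROM coords ORDER BY mic_id ASC, id ASC"
    )
    result = {}
    for mic_id, group in groupby(cur, key=itemgetter(0)):
        result[mic_id] = [(x, y) for _, x, y in group]
    return result
```

Calling `coords_per_mic_query(conn, range(1, 6))` and `coords_single_pass(conn)` on the database returned by `build_db()` should give identical dictionaries. Whether the single pass is actually faster depends on the data size and on the database layer in use, so it is worth timing both on a real project.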

pconesa commented 5 years ago

Peter, @the-best-elephant, tried to address this, but we could not get any significant improvement, not even when comparing runs on SQLite files without indexes. I'm excluding this from the release.