Unclear how to combine multiple tif files

dgoodwin208 commented 9 years ago

Hi,

I've tried many permutations of the tutorial and surrounding code but I cannot figure out how to import the data from an experiment that was saved across multiple tif files.

For a list of filenames, this would make the most sense to me, but doesn't work:

sequences = []
for filename in tiff_filenames:
        sequences.append([sima.Sequence.create('TIFF', filename)])
sequences = np.array(sequences) #needs to be cast as a numpy array?? crashes otherwise
sequences = [sima.Sequence.join(sequences)]

This then crashes when I call a correct() function from a motion package:

mc_approach = sima.motion.HiddenMarkov2D(granularity='row', max_displacement=[5, 10], verbose=True)
print "running correction"
dataset = mc_approach.correct(sequences, corr_data_output_file)

Which then gives the following crash File "/nethome/goodwin/venvNJC/lib/python2.7/site-packages/PIL/TiffImagePlugin.py", line 648, in seek self.im = Image.core.new(self.mode, self.size) ValueError: unrecognized mode

Furthermore, when I try to use the sima.iterables code as shown in the tutorial, I can't import sima.iterables as (for v1.1.0 of Sima) I get 'ImportError: no module named iterables'

pkaifosh commented 9 years ago

Hi Daniel,

Can you please give me a bit more information about how your data is saved. Is there one TIFF file per channel, or one TIFF file per frame? Or something else?

Thanks!

dgoodwin208 commented 9 years ago

Hey Patrick,

Thanks for the instant response :+1:

Each tif has two interleaved channels across ~1000 time slices. I was not currently splitting the channels, assuming sima wouldn't tell the difference. That said, it would be really awesome if you could help me figure out how to ignore one of the channels too (as the second one is junk).

Could share the tif header etc. if that would help.

Really appreciate the support on this.

-Dan

pkaifosh commented 9 years ago

Hi Dan,

I'm still having a bit of trouble understanding your format. Does each file have multiple channels and multiple time points? What are the different files for? Could you explain more, or send me the TIFF headers, or upload a subset of the data somewhere where I can take a quick look?

Thanks, Patrick

dgoodwin208 commented 9 years ago

Hi Patrick,

I would have to check about permission sharing the data (it's not mine), but can at least share the header and explain it better in the meantime.

Data is split across different tiffs only because the experimentalists' software can't write tiffs over 4GB, so the data is saved in a series of 1000-frame files with two channels interleaved (so 2000 frames total per file). Below is a properties view from ImageJ on one of the datasets.

More header info below:

scanimage.SI4.acqFrameBufferLength = 9 scanimage.SI4.acqFrameBufferLengthMin = 2 scanimage.SI4.acqFramesDone = 0 scanimage.SI4.acqNumAveragedFrames = 1 scanimage.SI4.acqNumFrames = 10000 scanimage.SI4.acqState = 'idle' scanimage.SI4.beamDirectMode = false scanimage.SI4.beamFillFracAdjust = 0 scanimage.SI4.beamFlybackBlanking = true scanimage.SI4.beamLengthConstants = zeros(0,1) scanimage.SI4.beamLiveAdjust = true scanimage.SI4.beamPowerLimits = zeros(0,1) scanimage.SI4.beamPowerUnits = 'percent' scanimage.SI4.beamPowers = zeros(0,1) scanimage.SI4.beamPzAdjust = zeros(0,1) scanimage.SI4.channelsAutoReadOffsets = 1 scanimage.SI4.channelsAutoReadOffsetsOnFocus = false scanimage.SI4.channelsDataType = 'int16' scanimage.SI4.channelsDisplay = [1;2] scanimage.SI4.channelsInputRange = {[0 1] [0 1] [0 4] [0 4]} scanimage.SI4.channelsLUT = [0 537;0 554;0 8191;0 8191] scanimage.SI4.channelsMergeColor = {'green' 'red' 'gray' 'none'} scanimage.SI4.channelsMergeEnable = 1 scanimage.SI4.channelsMergeFocusOnly = 1 scanimage.SI4.channelsOffset = [160;171;37;26] scanimage.SI4.channelsReadOffsetsOnStartup = false scanimage.SI4.channelsSave = [1;2] scanimage.SI4.channelsSubtractOffset = [true;true;false;false] scanimage.SI4.displayFrameBatchFactor = 2 scanimage.SI4.displayFrameBatchFactorLock = false scanimage.SI4.displayFrameBatchSelectLast = true scanimage.SI4.displayFrameBatchSelection = 2 scanimage.SI4.displayRollingAverageFactor = 4 scanimage.SI4.displayRollingAverageFactorLock = false scanimage.SI4.displayShowCrosshair = false scanimage.SI4.errorCondition = false scanimage.SI4.errorConditionIdentifiers = {} scanimage.SI4.errorConditionMessages = {} scanimage.SI4.fastCfgAutoStartTf = [false;false;false;false;false;false] scanimage.SI4.fastCfgAutoStartType = scanimage.SI4.fastZAcquisitionDelay = 0 scanimage.SI4.fastZActive = false scanimage.SI4.fastZAllowLiveBeamAdjust = false scanimage.SI4.fastZDiscardFlybackFrames = false scanimage.SI4.fastZEnable = false scanimage.SI4.fastZFillFraction = [] scanimage.SI4.fastZFramePeriodAdjustment = -100 scanimage.SI4.fastZImageType = 'XY-Z' scanimage.SI4.fastZNumDiscardFrames = 0 scanimage.SI4.fastZNumVolumes = 1 scanimage.SI4.fastZPeriod = [] scanimage.SI4.fastZScanType = 'sawtooth' scanimage.SI4.fastZSettlingTime = 0 scanimage.SI4.fastZUseAOControl = true scanimage.SI4.fastZVolumesDone = 0 scanimage.SI4.focusDuration = Inf scanimage.SI4.frameAcqFcnDecimationFactor = 1 scanimage.SI4.galvoAngle2LSMAngleFactor = 1 scanimage.SI4.galvoEnable = false scanimage.SI4.loggingEnable = 1 scanimage.SI4.loggingFileCounter = 9 scanimage.SI4.loggingFileCounterAutoReset = true scanimage.SI4.loggingFilePath = 'E:\140708' scanimage.SI4.loggingFileStem = 'm3134A' scanimage.SI4.loggingFramesPerFile = 1000 scanimage.SI4.loggingFramesPerFileLock = false scanimage.SI4.loggingMaxFileSize = [] scanimage.SI4.loopNumRepeats = Inf scanimage.SI4.loopRepeatPeriod = 10 scanimage.SI4.loopRepeatsDone = 0 scanimage.SI4.maxFrameEventRate = 65 scanimage.SI4.motorFastMotionThreshold = 100 scanimage.SI4.motorHasMotor = true scanimage.SI4.motorHasSecondMotor = true scanimage.SI4.motorMoveTimeout = 5 scanimage.SI4.motorPosition = [-44673.1 -12490 -14949.2 0.0990319] scanimage.SI4.motorSecondMotorZEnable = false scanimage.SI4.motorUserDefinedPositions = {} scanimage.SI4.mroiEnabled = false scanimage.SI4.mroiParams = <nonscalar struct/object> scanimage.SI4.mroiPixelsPerLine = 512 scanimage.SI4.mroiUpdateMinLines = 200 scanimage.SI4.mroiZoomFactor = 1 scanimage.SI4.pmtEnable = [] scanimage.SI4.pmtGain = [] scanimage.SI4.scanAngleMultiplierFast = 1 scanimage.SI4.scanAngleMultiplierSlow = 1 scanimage.SI4.scanFOVAngularRangeFast = 13.2 scanimage.SI4.scanFOVAngularRangeSlow = 13.2 scanimage.SI4.scanFillFraction = 0.66 scanimage.SI4.scanFillFractionSpatial = 0.860742 scanimage.SI4.scanForceSquarePixel = 1 scanimage.SI4.scanForceSquarePixelation = 1 scanimage.SI4.scanFramePeriod = 0.0333416 scanimage.SI4.scanFrameRate = 29.9926 scanimage.SI4.scanFramesStarted = 0 scanimage.SI4.scanLinePeriod = 6.3147e-05 scanimage.SI4.scanLinesPerFrame = 512 scanimage.SI4.scanMinZoomFactor = 1 scanimage.SI4.scanMode = 'bidirectional' scanimage.SI4.scanParamCache = [] scanimage.SI4.scanPhase = 17 scanimage.SI4.scanPhaseFine = 0 scanimage.SI4.scanPhaseFineMap.Count = 1 scanimage.SI4.versionMajor = 4.2 scanimage.SI4.versionMinor = 0 scanimage.SI4.versionPRNumber = 1 scanimage.SI4.triggerFrameNumber = 1 scanimage.SI4.triggerFrameStartTime = 0 scanimage.SI4.triggerTime = -0.41159773

pkaifosh commented 9 years ago

OK. Now I understand. This is the first case I have heard of where someone has the data split like this. For this reason, we don't have anything built in to support this. The join() method that you tried is intended for joining separate channels, which each sequence has the information for one channel.

The easiest for you would be to make a new subclass of the Sequence class for this special case of joining multiple sequences head-to-tail. I've sketched out something for you below, with a few TODO's incomplete and not at all tested.

import itertools as it
from sima.sequence import Sequence

class ConcatenatedSequence(Sequence):

    def __init__(self, sequences):
        shape = None
        num_channels = 0
        self._sequences = sequences
        for seq in sequences:
            if shape is None:
                shape = seq.shape[1:]
            if not shape == seq.shape[1:]:
                raise ValueError(
                    'Sequences being joined must have the same number '
                    'of frames, planes, rows, and columns.')
            num_channels += seq.shape[-1]
        self._shape = (sum(s.shape[0] for s in sequences),) + shape

    def __len__(self):
        return self._shape[0]

    @property
    def shape(self):
        return self._shape

    def __iter__(self):
        return it.chain(self._sequences)  # TODO: check this

    def _get_frame(self, t):
        #TODO: figure out which frame idx to grab from which sequence
        idx = ?
        seq = ?
        return seq._get_frame(idx)

    def _todict(self, savedir=None):
        return {
            '__class__': self.__class__,
            'sequences': [s._todict(savedir) for s in self._sequences],
        }

    @classmethod
    def _from_dict(cls, d, savedir=None):
        sequences = []
        for s in d.pop('sequences'):
            seq_class = s.pop('__class__')
            sequences.append(seq_class._from_dict(s, savedir))
        return cls(sequences)

Let me know if that does the trick. If you think that more people have similarly formatted data, you could contribute a cleaned-up version of this to the repo, perhaps along with a concatenate() class method analogous to join(), or perhaps a single method with an "axis" argument to do both.

dgoodwin208 commented 9 years ago

Ok great. Most importantly, I'm not crazy ;)

I'll take this and run with it. Thanks for sketching up the solution. If it works well I'll submit a pull request.

Cheers, Dan

losonczylab / sima

Unclear how to combine multiple tif files #147