NiklasTR commented 5 years ago

Hi Allen team, I am reopening the issue below because I did not find a solution.

I am running the release_1 branch on a custom dataset for the first time on an Ubuntu 18 AWS P3 instance. I created a conda environment according to the instructions and ran the example script successfully.

When I start training on a new dataset with the following command:

source scripts/train_model_example.sh 000012049003_test 0

I receive the following error:

RuntimeError: sequence argument must have length equal to input rank

I tried a fix proposed in #102, but without success.

This is my csv file:

path_signal,path_target
bucket/flatfield/0000120489032019-02-06T20_33_15-Measurement_2/0000120489032019-02-06T20_33_15-Measurement_2-sk1-A01-f01-ch2/r01c01f01p07-ch2sk1fk1fl1_flatfield.tiff,bucket/flatfi$
bucket/flatfield/0000120489032019-02-06T20_33_15-Measurement_2/0000120489032019-02-06T20_33_15-Measurement_2-sk1-A02-f01-ch2/r01c01f01p07-ch2sk1fk1fl1_flatfield.tiff,bucket/flatfi$
bucket/flatfield/000012048903__2019-02-06T20_33_15-Measurement_2/000012048903__2019-02-06T20_33_15-Measurement_2-sk1-A03-f01-ch2/r01c01f01p07-ch2sk1fk1fl1_flatfield.tiff,bucket/flatfi$

and here is my shell script:

#!/bin/bash -x

DATASET=${1:-dna}
N_ITER=50000
BUFFER_SIZE=30
BATCH_SIZE=24
RUN_DIR="saved_models/${DATASET}"
PATH_DATASET_ALL_CSV="data/csvs/${DATASET}.csv"
PATH_DATASET_TRAIN_CSV="data/csvs/${DATASET}/train.csv"
GPU_IDS=${2:-0}

cd $(cd "$(dirname ${BASH_SOURCE})" && pwd)/..

python scripts/python/split_dataset.py ${PATH_DATASET_ALL_CSV} "data/csvs" --train_size 0.75 -v
python train_model.py --n_iter ${N_ITER} --class_dataset TiffDataset --path_dataset_csv ${PATH_DATASET_TRAIN_CSV} --buffer_size ${BUFFER_SIZE} --buffer_switch_frequency 2000000 --batch_size ${BATCH_SIZE} --path_run_dir ${RUN_DIR} --gpu_ids ${GPU_IDS}

NiklasTR commented 5 years ago

Reopened #119 after misunderstanding with @counkomol

counkomol commented 5 years ago

For the record, we generally cannot support external users who are trying to run custom datasets. There is simply no way to debug code we cannot see, and we cannot afford the time.

With that said, it seems like the scipy.ndimage.zoom call is the issue. I recommend you figure out what the function does and what input and output it expects. Take a look at the example below and make sure you understand why the second call to zoom raises an error:

>>> from scipy.ndimage import zoom
>>> import numpy as np
>>> ar = np.random.random((2, 3, 4))
>>> zoom(ar, (1, 1, 1))
array([[[0.64388984, 0.14695717, 0.4001792 , 0.64723728],
        [0.45446395, 0.46000497, 0.97303431, 0.47801983],
        [0.69656194, 0.09671482, 0.52760002, 0.61470544]],

       [[0.17250034, 0.83149525, 0.25708348, 0.87488021],
        [0.8553214 , 0.39795458, 0.42644317, 0.37339944],
        [0.83100132, 0.39319432, 0.17649567, 0.39479372]]])
>>> zoom(ar, (1, 1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/chek/miniconda3/envs/fnet_dev/lib/python3.7/site-packages/scipy/ndimage/interpolation.py", line 595, in zoom
    zoom = _ni_support._normalize_sequence(zoom, input.ndim)
  File "/home/chek/miniconda3/envs/fnet_dev/lib/python3.7/site-packages/scipy/ndimage/_ni_support.py", line 65, in _normalize_sequence
    raise RuntimeError(err)
RuntimeError: sequence argument must have length equal to input rank
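
The session above fails because the zoom sequence has two entries while the array has three axes. A minimal sketch of a pre-flight check follows. It assumes the mismatch comes from images whose number of dimensions differs from the number of zoom factors, which this thread does not confirm. `safe_zoom` is a hypothetical helper for illustration and is not part of pytorch_fnet or scipy.

```python
import numpy as np
from scipy.ndimage import zoom


def safe_zoom(ar, factors):
    """Hypothetical helper: require one zoom factor per array axis.

    A scalar factor is passed through unchanged, since zoom applies it
    to every axis.
    """
    if np.ndim(factors) > 0 and len(factors) != ar.ndim:
        raise ValueError(
            f"got {len(factors)} zoom factors for a {ar.ndim}-D array "
            f"of shape {ar.shape}"
        )
    return zoom(ar, factors)


ar_3d = np.random.random((2, 3, 4))
ar_2d = np.random.random((3, 4))

out_3d = safe_zoom(ar_3d, (1, 1, 1))  # three factors, three axes
out_2d = safe_zoom(ar_2d, (1, 1))     # two factors, two axes

try:
    safe_zoom(ar_3d, (1, 1))          # two factors, three axes
except ValueError as err:
    print(err)
```

Printing `ar.ndim` and `ar.shape` for one of your own images just before the zoom call should show whether the number of axes matches the number of factors being passed.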

AllenCellModeling / pytorch_fnet

Reopening: "RuntimeError: sequence argument must have length equal to input rank" #131
