environment.yml unable to resolve dependencies

WillGiang commented 3 years ago

Hello @BPHO-Salk ,

I wanted to test PSSR on some 8nm FIB-SEM datasets and am attempting this on Ubuntu 18.04 (through WSL2), but I think I'm encountering dependency hell.

Using the provided environment.yml leads to four libraries not being found:

$ conda env create --file=env.yml

Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:
  - regex==2018.01.10=py37h14c3975_1000
  - dataclasses==0.6=py_0
  - spacy==2.0.18=py37hf484d3e_1000
  - thinc==6.12.1=py37h637b7d7_1000

I edited the .yml and moved the four packages under the pip section, but incompatibilities persist (see PSSR-environment-yml-issues.txt for the full list of conflicts).

Similarly, if I add conda-forge to the list of channels, I get an UnsatisfiableError (see PSSR-with-conda-forge-channel.txt).
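
For anyone hitting the same ResolvePackageNotFound error: the four pins above each carry a build string (the part after the second "="), and build strings are usually specific to one platform and channel. A minimal sketch, assuming the build strings are at least part of the problem, is to strip them and keep only name and version. The helper name strip_build_strings is hypothetical, and this was not tested against the PSSR environment.yml; the solver may still fail if the pinned versions themselves are unavailable.

```python
import re

# Matches a conda dependency line of the form "- name==version=build"
# or "- name=version=build" (the latter is what `conda env export` writes).
_PIN = re.compile(r"^(\s*-\s*)([A-Za-z0-9_.\-]+)(==?)([^=\s]+)=([^=\s]+)\s*$")


def strip_build_strings(yml_text):
    """Return yml_text with the trailing '=build' removed from each pin.

    Lines that do not look like a build-pinned dependency (channel names,
    the 'pip:' key, pip-style 'name==version' pins) are left unchanged.
    """
    cleaned = []
    for line in yml_text.splitlines():
        match = _PIN.match(line)
        if match:
            indent, name, op, version, _build = match.groups()
            line = f"{indent}{name}{op}{version}"
        cleaned.append(line)
    return "\n".join(cleaned)


if __name__ == "__main__":
    example = "  - regex==2018.01.10=py37h14c3975_1000"
    print(strip_build_strings(example))
```

The cleaned text can be written to a new file and passed to conda env create --file=... in place of the original.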

Any help/suggestions would be greatly appreciated!

Will

WillGiang commented 3 years ago

I can get around this somewhat by using env-fresh-Copy.txt

but then libtiff can't be imported.

Since Inference_PSSR_for_EM.ipynb only uses libtiff to load the input stack, I commented out the import statement and replaced the first two lines of tif_predict_movie_blend_slices with data = skimage.external.tifffile.imread(tif_in)
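
A note on that replacement: skimage.external is not present in every scikit-image release, while the standalone tifffile package exposes the same imread function. A minimal sketch of a fallback import, assuming at least one of the two is installed; the helper name first_importable is hypothetical and not part of PSSR.

```python
import importlib


def first_importable(module_names):
    """Import and return the first module in module_names that is available."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError too; try the next candidate.
            continue
    raise ImportError(f"none of these modules could be imported: {module_names}")


# Intended use in the notebook (not executed here, since it needs one of the
# TIFF packages and an input file):
#
#   tifffile = first_importable(["tifffile", "skimage.external.tifffile"])
#   data = tifffile.imread(tif_in)
```

This keeps the notebook independent of libtiff while tolerating either source of imread.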

The cells run without issue until loading the model.

<ipython-input-9-e1d63bdede98> in <module>
      1 #state = torch.load(path, map_location='cuda:0')
      2 model_name = 'PSSR_for_EM_1024'
----> 3 learn = load_learner('./stats/models', f'{model_name}.pkl').to_fp16()
      4 size = int(model_name.split('_')[-1])
      5 print(f'{model_name} model is being used.')

~/anaconda3/envs/pssr/lib/python3.7/site-packages/fastai/basic_train.py in load_learner(path, file, test, **db_kwargs)
    608     "Load a `Learner` object saved with `export_state` in `path/file` with empty data, optionally add `test` and load on `cpu`. `file` can be file-like (file or buffer)"
    609     source = Path(path)/file if is_pathlike(file) else file
--> 610     state = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source)
    611     model = state.pop('model')
    612     src = LabelLists.load_state(path, state.pop('data'))

~/anaconda3/envs/pssr/lib/python3.7/site-packages/torch/serialization.py in load(f, map_location, pickle_module, **pickle_load_args)
    385         f = f.open('rb')
    386     try:
--> 387         return _load(f, map_location, pickle_module, **pickle_load_args)
    388     finally:
    389         if new_fd:

~/anaconda3/envs/pssr/lib/python3.7/site-packages/torch/serialization.py in _load(f, map_location, pickle_module, **pickle_load_args)
    572     unpickler = pickle_module.Unpickler(f, **pickle_load_args)
    573     unpickler.persistent_load = persistent_load
--> 574     result = unpickler.load()
    575 
    576     deserialized_storage_keys = pickle_module.load(f, **pickle_load_args)

~/anaconda3/envs/pssr/lib/python3.7/site-packages/torch/serialization.py in persistent_load(saved_id)
    535                 obj = data_type(size)
    536                 obj._torch_load_uninitialized = True
--> 537                 deserialized_objects[root_key] = restore_location(obj, location)
    538             storage = deserialized_objects[root_key]
    539             if view_metadata is not None:

~/anaconda3/envs/pssr/lib/python3.7/site-packages/torch/serialization.py in default_restore_location(storage, location)
    117 def default_restore_location(storage, location):
    118     for _, _, fn in _package_registry:
--> 119         result = fn(storage, location)
    120         if result is not None:
    121             return result

~/anaconda3/envs/pssr/lib/python3.7/site-packages/torch/serialization.py in _cuda_deserialize(obj, location)
     93 def _cuda_deserialize(obj, location):
     94     if location.startswith('cuda'):
---> 95         device = validate_cuda_device(location)
     96         if getattr(obj, "_torch_load_uninitialized", False):
     97             storage_type = getattr(torch.cuda, type(obj).__name__)

~/anaconda3/envs/pssr/lib/python3.7/site-packages/torch/serialization.py in validate_cuda_device(location)
     87                            'torch.load with map_location to map your storages '
     88                            'to an existing device.'.format(
---> 89                                device, torch.cuda.device_count()))
     90     return device
     91 

RuntimeError: Attempting to deserialize object on CUDA device 1 but torch.cuda.device_count() is 1. Please use torch.load with map_location to map your storages to an existing device.

The traceback here is actually helpful: the saved model refers to CUDA device 1, but my machine only has device 0. I can get around this by modifying line 610 in ~/anaconda3/envs/pssr/lib/python3.7/site-packages/fastai/basic_train.py to pass map_location, i.e. torch.load(source, map_location='cuda:0'), which explicitly selects my only GPU.

old: state = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source)

new: state = torch.load(source, map_location='cpu') if defaults.device == torch.device('cpu') else torch.load(source, map_location='cuda:0')
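
The rule behind this change (map the saved storages to a device that actually exists) can be sketched as a small helper. This is only an illustration: choose_map_location and load_checkpoint are hypothetical names, and the sketch covers a direct torch.load call. It does not change what load_learner does internally.

```python
def choose_map_location(cuda_device_count):
    """Return a map_location string that is valid for the given device count.

    With no visible GPU the checkpoint is mapped to the CPU; otherwise it is
    mapped to the first GPU, whatever device index it was saved from.
    """
    return "cuda:0" if cuda_device_count > 0 else "cpu"


def load_checkpoint(path):
    """Load a checkpoint onto a device that exists on this machine."""
    # Imported lazily so choose_map_location can be used without torch.
    import torch

    location = choose_map_location(torch.cuda.device_count())
    return torch.load(path, map_location=location)
```

On a single-GPU machine like mine, choose_map_location(torch.cuda.device_count()) gives 'cuda:0', the same value hard-coded in the edit above.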

But then the next cell (while predicting) fails with RuntimeError: expected backend CPU and dtype Float but got backend CUDA and dtype Float

I'm not really sure what to do at this point.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-9-e7a6526bad09> in <module>
      4     orig_name = results/f'{fn.stem}_orig.tif'
      5     print(fn)
----> 6     tif_predict_movie_blend_slices(learn, str(fn), size=size, orig_out=orig_name, pred_out=pred_name )

<ipython-input-3-0521c6b7a95d> in tif_predict_movie_blend_slices(learn, tif_in, orig_out, pred_out, size)
     10             img /= img_max
     11             img = img[np.newaxis, :]
---> 12             out_img = unet_image_from_tiles_blend(learn, img, tile_sz=size)
     13             pred = (out_img[None]*255).astype(np.uint8)
     14             pred_img_out = pred_out+f'_slice{depth}.tif'

<ipython-input-4-5880e9deca7b> in unet_image_from_tiles_blend(learn, in_img, tile_sz, scale, overlap_pct, img_info)
     37             else:
     38                 img_in = Image(in_tile[:,:,0][None])
---> 39             pred, _, _ = learn.predict(img_in)
     40 
     41             out_tile = pred.data.numpy()[0]

~/anaconda3/envs/pssr/lib/python3.7/site-packages/fastai/basic_train.py in predict(self, item, return_x, batch_first, with_dropout, **kwargs)
    368         norm = getattr(self.data,'norm',False)
    369         if norm:
--> 370             x = self.data.denorm(x)
    371             if norm.keywords.get('do_y',False): raw_pred = self.data.denorm(raw_pred)
    372         ds = self.data.single_ds

~/anaconda3/envs/pssr/lib/python3.7/site-packages/fastai/vision/data.py in denormalize(x, mean, std, do_x)
     59 def denormalize(x:TensorImage, mean:FloatTensor,std:FloatTensor, do_x:bool=True)->TensorImage:
     60     "Denormalize `x` with `mean` and `std`."
---> 61     return x.cpu().float()*std[...,None,None] + mean[...,None,None] if do_x else x.cpu()
     62 
     63 def _normalize_batch(b:Tuple[Tensor,Tensor], mean:FloatTensor, std:FloatTensor, do_x:bool=True, do_y:bool=False)->Tuple[Tensor,Tensor]:

RuntimeError: expected backend CPU and dtype Float but got backend CUDA and dtype Float

WillGiang commented 3 years ago

The new environment set-up instructions work for me. Closing this issue but will include the instructions below for posterity. Thanks, Linjing!

git clone https://github.com/BPHO-Salk/PSSR.git
conda create --name pssr python=3.7
conda activate pssr
pip install fastai==1.0.55 tifffile czifile scikit-image
pip uninstall torch
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch
