gitter-lab / t-cell-classification

Jupyter notebooks demonstrating a microscopy machine learning image analysis workflow
BSD 3-Clause Clear License

Assess Windows compatibility in AppVeyor #3

Closed agitter closed 5 years ago

agitter commented 5 years ago

Windows builds in #2 are not running. This pull request instead uses the AppVeyor CI service for the Windows tests. One advantage is that AppVeyor provides pre-installed versions of Miniconda.

Due to the r-base and mro-base problems described in https://github.com/gitter-lab/t-cell-classification/pull/2#issuecomment-483619215, these tests will initially use a different conda environment with mro-base.

agitter commented 5 years ago

AppVeyor supports artifacts so we could consider converting the Jupyter notebooks to PDF or HTML, saving the artifacts, and inspecting the output to confirm everything works properly.
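A minimal sketch of that idea, assuming the build calls `jupyter nbconvert` from a small Python helper. The function names and the `artifacts` output directory are hypothetical and not part of this repository; only the `nbconvert` flags themselves are standard.

```python
import subprocess
from pathlib import Path


def nbconvert_command(notebook, output_dir="artifacts", to="html"):
    """Build a `jupyter nbconvert` command that executes a notebook and renders it."""
    return [
        "jupyter", "nbconvert",
        "--to", to,                       # output format, e.g. "html" or "pdf"
        "--execute",                      # run all cells before converting
        "--output-dir", str(output_dir),  # hypothetical directory for the CI artifacts
        str(notebook),
    ]


def convert_notebooks(notebook_dir=".", output_dir="artifacts"):
    """Execute and convert every notebook in notebook_dir.

    subprocess.run(..., check=True) raises CalledProcessError if nbconvert
    exits with a non-zero status, which would fail the build.
    """
    for notebook in sorted(Path(notebook_dir).glob("*.ipynb")):
        subprocess.run(nbconvert_command(notebook, output_dir), check=True)
```

The AppVeyor configuration would still need to declare the output directory as an artifact so the rendered files can be downloaded and inspected.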

agitter commented 5 years ago

Initially some notebooks passed but others failed due to r-ggpubr. I need to find the right combination of conda channels that has all of the R packages in a compatible set. The first attempt didn't work:

UnsatisfiableError: The following specifications were found to be in conflict:
  - mro-base=3.5.1
  - r-ggpubr=0.2 -> r-base[version='>=3.5.1,<3.5.2.0a0'] -> _r-mutex=1[build=anacondar_1]

agitter commented 5 years ago

Trying strict channel priority gave the following error:

UnsatisfiableError: The following specifications were found to be in conflict:
  - python=3.6.8

agitter commented 5 years ago

I cannot find a combination of packages that does not have conflicting requirements for r-base and mro-base. I believe that is the root cause of this error:

UnsatisfiableError: The following specifications were found to be in conflict:
  - r-ggpubr=0.2
  - r::rpy2=2.9.4

agitter commented 5 years ago

rpy2 did not install correctly from pip, causing errors in the notebooks that require it.

agitter commented 5 years ago

Even the conda-forge versions of these two packages are not consistent:

UnsatisfiableError: The following specifications were found to be in conflict:
  - conda-forge::r-ggpubr=0.2
  - conda-forge::rpy2=2.9

The next best option may be to create a working conda environment that also has rpy2 and then install ggpubr outside of conda.

agitter commented 5 years ago

1a89d13 stops AppVeyor from building both the pull request branch and the master branch after merging. Running both is too slow because the tests run in series.

I was using the wrong syntax to install a specific version of ggpubr, but installing from the command line does work. This is the best option until the R base, rpy2, or R packages are updated in a conda channel. We do not need to specify the ggpubr version yet because we do not lock the versions of many other R packages. We can check MRAN to see the available versions.
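For illustration, installing ggpubr from the command line after the conda environment is created could look roughly like the sketch below. This is not the command used in this pull request: the helper names are hypothetical, and the CRAN mirror URL is an assumption (the environment here may resolve packages from MRAN instead).

```python
import subprocess

# Assumption: a general CRAN mirror. The actual build may use a different repository.
CRAN_MIRROR = "https://cloud.r-project.org"


def r_install_command(package, repos=CRAN_MIRROR):
    """Build an Rscript command that installs one R package with install.packages()."""
    expression = f"install.packages('{package}', repos='{repos}')"
    return ["Rscript", "-e", expression]


def install_r_package(package, repos=CRAN_MIRROR):
    """Run the install inside the currently active environment."""
    subprocess.run(r_install_command(package, repos), check=True)
```

Note that `install.packages()` reports some failures as warnings, so a zero exit code alone does not prove the package is usable; loading it afterwards with `library(ggpubr)` is a more reliable check.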

agitter commented 5 years ago

The neural network notebooks give the error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Miniconda36-x64\envs\t-cell-image\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Miniconda36-x64\envs\t-cell-image\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'DataGenerator' on <module '__main__' (built-in)>

xiaohk commented 5 years ago

https://github.com/gitter-lab/t-cell-classification/pull/3/commits/1445c3f4e5baabe8add49aa53d21c0408027d0b6 changes nproc (the number of workers) to be system-dependent with a default of 1. It used to be a constant 4. I use os.cpu_count() to check the number of CPU cores.

Let's see if this can fix the AttributeError mentioned in https://github.com/gitter-lab/t-cell-classification/pull/3#issuecomment-483834977.
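A rough sketch of what a system-dependent worker count with a default of 1 might look like. The function name `choose_nproc` is hypothetical and the logic in the linked commit may differ; only `os.cpu_count()` is taken from the comment above.

```python
import os


def choose_nproc(default=1):
    """Pick the number of worker processes from the machine's CPU count."""
    # os.cpu_count() returns None when the count cannot be determined,
    # so fall back to a single worker in that case.
    count = os.cpu_count()
    return count if count is not None else default


nproc = choose_nproc()
```

Background for the traceback: on Windows, multiprocessing starts workers with the spawn method, so each worker has to unpickle its arguments by looking classes up by module and name. A class defined only inside a notebook's `__main__` is not available in the new process, which is consistent with the `Can't get attribute 'DataGenerator'` message.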

agitter commented 5 years ago

I can run this locally on my Windows machine now, which may help our debugging. I also encountered a different multiprocessing error in cell [8], so I think your recent updates are on the right track. The version of simple_neural_network.ipynb in 42e8efd works for me.

I did have two problems when running locally instead of in the clean environment:

agitter commented 5 years ago

We'll create a new issue to log the shortcomings of this setup: