dieterich-lab / rp-bp

Rp-Bp is a Bayesian approach to predict, at base-pair resolution, ribosome occupancy and translation.
MIT License
7 stars 5 forks source link

macos ci #126

Closed lkeegan closed 1 year ago

lkeegan commented 1 year ago

The dev_dependencies branch tries to:

- allow python 3.6-3.9 in setup.cfg
- use dev-ssciwr branch of pbiotools
  - temporarily add all of its conda deps to our environment here
  - then install using --no-deps to avoid bringing in pypi dependencies
  - once pbiotools is published on bioconda can switch to using that & remove these conda deps
- use bioconda for all other rp-bp deps
  - currently no version pins except pystan < 3
  - this is hopefully a similar setup to the eventual bioconda recipe for rp-bp
  - once migration -> pystan 3 is done should also be able to allow python 3.10
- add macos to CI now that flexbar is available for mac on bioconda

Linux tests pass on CI, but macos segfaults on one of the calls to STAR: https://github.com/lkeegan/rp-bp/actions/runs/2902072541

Could this be related to this earlier warning?:

!!!!! WARNING: --genomeSAindexNbases 14 is too large for the genome size=15072434, which may cause seg-fault at the mapping step. Re-run genome generation with recommended --genomeSAindexNbases 10

eboileau commented 1 year ago

It is using STAR version: 2.7.10a, right? I'll check that.

lkeegan commented 1 year ago

yes:

star                      2.7.10a              h527b516_0    bioconda
eboileau commented 1 year ago

There has been quite a few STAR versions in between, so indeed we might have issues with some parameters, etc.

The simplest thing to do is to follow the recommendation, and use --genomeSAindexNbases 10 at the index creation step. This seems related to the small genome size. The problem, however, is that rpbp currently only allows to pass options to STAR in creating the profiles, but not when preparing the genome. I'll have to make changes in the code, and then pass the option in the test.

eboileau commented 1 year ago

I added STAR options (genome) to prepare_rpbp_genome.py, and --genomeSAindexNbases 10 to confest.py (commit https://github.com/dieterich-lab/rp-bp/commit/50f8ece98abc255c5f64a8433048261d17541fcc), to accommodate for the small genome size of the example. This should now work https://github.com/dieterich-lab/rp-bp/actions/runs/2903325401

Note: External programs (STAR, Flexbar) option handling needs to be improved though... this is brittle now, e.g. if using --use-slurm, we need to escape them when passing the arguments --star-options \"--genomeSAindexNbases 10\", but if we do without --use-slurm, then this does not work. It largely depends on the shell, as it may strips the quotes, and the behaviour is expected to be different when running run_rpbp_pipeline.py, as it also pre-processes the option string... (see pbiotools pgrm_utils.py).

lkeegan commented 1 year ago

Looks like it may be an issue with STAR, there are various segfault fixes to STAR in the github repo after the 2.7.10 release which are not on bioconda, and downgrading to the previous 2.7.9 version seems to resolve this issue, although the macos tests still fail but at a later point: https://github.com/lkeegan/rp-bp/runs/7952752406?check_suite_focus=true

eboileau commented 1 year ago

I have 2.7.10a installed, and I don't experience any issues, but I'm fine with fixing the version to 2.7.9a for now.


The error is different

estimate-metagene-profile-bayes-factors: error: argument --periodic-models: expected at least one argument

and is due to failed model compilation at step install rp-bp with --no-deps option

  INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_77987227f3abe322d843fe958b86c59f NOW.
  /usr/local/miniconda/envs/rpbp-dev/lib/python3.7/site-packages/setuptools/dist.py:760: UserWarning: Usage of dash-separated 'long-description' will not be supported in future versions. Please use the underscore name 'long_description' instead
    % (opt, underscore_opt)
  /usr/local/miniconda/envs/rpbp-dev/lib/python3.7/site-packages/setuptools/dist.py:760: UserWarning: Usage of dash-separated 'maintainer-email' will not be supported in future versions. Please use the underscore name 'maintainer_email' instead
    % (opt, underscore_opt)
  Traceback (most recent call last):
    File "/usr/local/miniconda/envs/rpbp-dev/lib/python3.7/site-packages/setuptools/_distutils/unixccompiler.py", line 174, in _compile
      extra_postargs)
    File "/usr/local/miniconda/envs/rpbp-dev/lib/python3.7/site-packages/setuptools/_distutils/ccompiler.py", line 917, in spawn
      spawn(cmd, dry_run=self.dry_run, **kwargs)
    File "/usr/local/miniconda/envs/rpbp-dev/lib/python3.7/site-packages/setuptools/_distutils/spawn.py", line 69, in spawn
      "command %r failed with exit code %s" % (cmd, exitcode))
  distutils.errors.DistutilsExecError: command '/usr/local/miniconda/envs/rpbp-dev/bin/x86_64-apple-darwin13.4.0-clang' failed with exit code 254

This is a macOS (version?) specific issue, apparently related to xcode commandline tools/clang compiler. There are various hints, but this one seems related to Stan.

lkeegan commented 1 year ago

I see - maybe it would make sense to first update rp-bp to use pystan3 and see if this issue is still there before trying to fix this?

eboileau commented 1 year ago

Indeed, that's what I also think.

eboileau commented 1 year ago

But I would first need to check https://pystan.readthedocs.io/en/latest/upgrading.html#upgrading

eboileau commented 1 year ago

There shouldn't be major issues, at least syntactically, but clearly I would need to test it. I will try to address that as soon as possible (I have some other deadlines coming...)

lkeegan commented 1 year ago

Sure, no urgency from my side! A related improvement (as long as you don't need to compile new stan models after installation) could be to do the pystan compile step as part of the bioconda package build process instead of as part of the user install, which would:

eboileau commented 1 year ago

The main obstacle to distributing a compiled Stan model is that it needs to be compiled for different platforms separately. So typically, the simplest way around this problem involves compiling (and saving) the models during installation, as we currently do.

lkeegan commented 1 year ago

Sure, but that is essentially what conda is: a bunch of software pre-compiled for different plaforms - so we could pre-compile the stan models for each platform on bioconda in the same way that e.g. flexbar is compiled for each platform. (It would get more involved if we need to also do this for each python version, that depends I guess on exactly how pystan interacts with stan). I agree what you currently do is also fine!

eboileau commented 1 year ago

@lkeegan please let me know what is missing to resolve this issue? The CmdStanPy interface is now implemented, and the models are installed/compiled in a default location under the conda environment, which is, I think, the best option, since it isolates them from any other installation/environment.

lkeegan commented 1 year ago

@eboileau sounds good - adding CI for macos and testing different python versions (as done in this now out-of-date branch https://github.com/lkeegan/rp-bp/tree/dev_dependencies) is still missing. This was paused as the CI failed on macos but hopefully this has been fixed by the move to cmdstanpy - I can update this branch and make a PR in Jan.

eboileau commented 1 year ago

@lkeegan I think this is sorted, right?