stephenslab / dsc

Repo for Dynamic Statistical Comparisons project
https://stephenslab.github.io/dsc-wiki
MIT License

conda package for dscrutils? #224

Closed aryarm closed 3 years ago

aryarm commented 3 years ago

Is there an official conda package for dscrutils? Currently, the recommended solution is to install it using devtools, but that can make it difficult to utilize dscrutils as a dependency of other conda recipes.

aryarm commented 3 years ago

I made an attempt at creating a conda recipe based on this recipe, the conda-forge template, and the DESCRIPTION file. It seems to work ok when I install it, but I'm not sure if I got all of the dependencies. I would definitely appreciate any feedback on this and would be willing to help create an official conda package if there isn't one already.

stephens999 commented 3 years ago

@aryarm thanks for being proactive! We are a little busy with a deadline right now, but someone will get back to you eventually....

gaow commented 3 years ago

@aryarm thanks for the input! I cannot claim expertise in conda package distribution (the dsc recipe was done by @jdblischak), but I can definitely try to find some time (maybe after this deadline) and learn it by reviewing your code. For starters, could you send a pull request with what you have to dsc on conda-forge? (@jdblischak, please advise whether that is the proper way to do it.)

jdblischak commented 3 years ago

Is the plan to submit dscrutils to CRAN? If not, I'd prefer not to add it to conda-forge. While you could probably get it on conda-forge, that adds more maintenance burden for the R maintainers (e.g. the conda-forge team has built tools to automatically update CRAN packages, which don't work for non-CRAN packages).

@aryarm Your conda recipe looks fine to me. I could also update my recipe and upload it to my channel if that would be helpful. Also, what is your use case for DSC? Are you using DSC/dscrutils to perform an analysis? Or are you bundling them into another software product?

pcarbo commented 3 years ago

I can say that we are not planning on submitting dscrutils to CRAN. (I can't provide much input on creating a conda recipe as I am unfamiliar with this.)

aryarm commented 3 years ago

Is the plan to submit dscrutils to CRAN? If not, I'd prefer not to add it to conda-forge.

the conda-forge team has built tools to automatically update CRAN packages, which don't work for non-CRAN packages

I can say that we are not planning on submitting dscrutils to CRAN.

Ah, ok. That makes sense, then. No worries!

It's not a major concern for me; just thought I'd ask in case I could help. Anyway, I have no personal experience with submitting packages to CRAN.

@aryarm Your conda recipe looks fine to me. I could also update my recipe and upload it to my channel if that would be helpful.

Thanks for the feedback, @jdblischak! That's much appreciated. No need to update your recipe if you don't have the bandwidth.

It seemed like there were a few dependencies that didn't get pulled in when I tried your recipe, @jdblischak. Perhaps updates to dscrutils have incorporated more packages since you last built it. But you generously provided instructions for creating my own recipe, so I went ahead and did that.

By the way, I found your repository super helpful for understanding how to create a recipe for dscrutils. Thanks for creating such a great resource! I'm sure I will continue to use it for other recipes in the future.

Also, what is your use case for DSC? Are you using DSC/dscrutils to perform an analysis? Or are you bundling them into another software product?

Sort-of both, but kinda neither. I'm trying to create a reproducible Snakemake pipeline (which we may develop into a fully-fledged tool one day). A script from my pipeline uses this script from SuSiE, which uses dscrutils. But the conda-forge package for SuSiE doesn't seem to pull in dscrutils as a dependency. (I'm guessing this was because dscrutils isn't on conda-forge, but I'm not sure.)

jdblischak commented 3 years ago

It seemed like there were a few dependencies that didn't get pulled in when I tried your recipe, @jdblischak. Perhaps updates to dscrutils have incorporated more packages since you last built it.

You're right. It looks like a few new dependencies have been added since I made my initial recipe.

A script from my pipeline uses this script from SuSiE, which uses dscrutils. But the conda-forge package for SuSiE doesn't seem to pull in dscrutils as a dependency. (I'm guessing this was because dscrutils isn't on conda-forge, but I'm not sure.)

That script is in the inst/ directory. It's not officially part of the susieR package, i.e. the code in R/.

Creating a conda recipe requires choosing how many optional dependencies to include. For example, I added r-reticulate to my recipe for dscrutils since it was for my personal use, but I usually don't include Suggested dependencies in a recipe. It's usually better to let users explicitly install optional dependencies.
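
As a rough illustration of the choice described above, the requirements section of a recipe might look like the sketch below. This is a hypothetical excerpt, not the actual r-dscrutils recipe: the only package names taken from the thread are r-reticulate and the package itself, and the real list of hard dependencies would have to come from the DESCRIPTION file.

```yaml
# Hypothetical excerpt of a meta.yaml for r-dscrutils.
# The dependency list is illustrative, not copied from DESCRIPTION.
requirements:
  host:
    - r-base
    # hard dependencies (Imports/Depends in DESCRIPTION) would go here
  run:
    - r-base
    # hard dependencies repeated here so they are installed at run time
    # Optional dependency, left out by default so users can opt in.
    # Uncomment to bundle it, e.g. for a personal-use build:
    # - r-reticulate
```

Keeping the optional package commented out means the default install stays small, and anyone who needs it can add it to their own environment file explicitly.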

By the way, I found your repository super helpful for understanding how to create a recipe for dscrutils. Thanks for creating such a great resource! I'm sure I will continue to use it for other recipes in the future.

Happy to hear you found it useful!

I'm trying to create a reproducible Snakemake pipeline (which we may develop into a fully-fledged tool one day).

I recommend creating a conda environment like the one below to run SuSiE with DSC:

name: susieDsc
channels:
  - conda-forge
  - aryarm
  - nodefaults
dependencies:
  - r-dscrutils
  - r-susier

aryarm commented 3 years ago

That script is in the inst/ directory. It's not officially part of the susieR package, i.e. the code in R/.

Creating a conda recipe requires choosing how many optional dependencies to include. For example, I added r-reticulate to my recipe for dscrutils since it was for my personal use, but I usually don't include Suggested dependencies in a recipe. It's usually better to let users explicitly install optional dependencies.

Ah, ok. I had a hunch that might be the case, but I didn't really understand the difference between inst/ and R/. Thanks!

I recommend creating a conda environment like the one below to run SuSiE with DSC:

Great, thanks! Yeah, that's what I eventually went with.

Also +1 for nodefaults! 😄 imo that should be a best practice

jdblischak commented 3 years ago

I just updated my r-dscrutils conda recipe to 0.4.0: https://anaconda.org/jdblischak/r-dscrutils/files?version=0.4.0

This was the most recent version that was easy for me to build. 0.4.3.5 doesn't have a git tag, 0.4.3 had an issue with the digest package, 0.4.2 doesn't have a corresponding dsc conda package available, and 0.4.1 doesn't have a git tag.

I didn't really understand the difference between inst/ and R/

I didn't do a great job of explaining it either. Essentially R/ contains all the package functions that you can execute after installing and loading the package. The inst/ directory is a method for including any other files in your installed package. They could be anything: PDFs with documentation, scripts in other programming languages, etc.

Also +1 for nodefaults! 😄 imo that should be a best practice

100% agree!

aryarm commented 3 years ago

I just updated my r-dscrutils conda recipe

Wow, thank you so much!

0.4.3.5 doesn't have a git tag

Out of curiosity, is there any reason why you can't use url instead of git_url for situations like those? Would you consider it bad practice? For example, for dscrutils 0.4.3.5 at 7938e22b5e58c64aecf1d13a1db8296f329f5ba5:

source:
  url: https://github.com/stephenslab/dsc/archive/7938e22b5e58c64aecf1d13a1db8296f329f5ba5.tar.gz
  sha256: 67625a3fbeeee46452a1b289ba64380204de6f9ae6edd6106c8fca487bf40982

Also, RE the difference between R/ and inst/: that makes a lot of sense! Thanks for the explanation.

jdblischak commented 3 years ago

is there any reason why you can't use url instead of git_url for situations like those? Would you consider it bad practice?

You could do that. My motivations for using git tags are mostly practical:

As for best practice, it depends on the context. Is the conda binary mainly for you or do you plan on others relying on it? Is the author willing to tag it? In general a tag indicates that the software is in a stable state, and appears more intentional.
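
For comparison with the url-plus-sha256 form quoted earlier in the thread, a tag-based source section might look like the sketch below. It assumes conda-build's `git_url` and `git_rev` keys; the tag name is a placeholder, since the thread does not say how the dsc repository names its tags.

```yaml
# Sketch of a tag-based source section for a conda recipe.
# "<tag-name>" is a placeholder; substitute a tag that exists in the repo.
source:
  git_url: https://github.com/stephenslab/dsc
  git_rev: "<tag-name>"
```

With this form the recipe follows whatever commit the tag points to, whereas the url form pins an exact archive and verifies it against the sha256 checksum.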