kfuku52 / amalgkit

RNA-seq data amalgamation for large-scale evolutionary transcriptomics
BSD 3-Clause "New" or "Revised" License

conda recipe #34

Open kfuku52 opened 3 years ago

kfuku52 commented 3 years ago

Currently, it's not a high-priority issue, but we would need this upon publication.

Hego-CCTB commented 3 years ago

For later reference:

https://docs.conda.io/projects/conda-build/en/latest/concepts/recipe.html

https://conda.io/projects/conda-build/en/latest/user-guide/tutorials/building-conda-packages.html

https://conda.io/projects/conda-build/en/latest/user-guide/tutorials/build-pkgs.html

kfuku52 commented 1 year ago

@Hego-CCTB I will take care of it if you don't have time.

Hego-CCTB commented 1 year ago

This is something I'd like to do myself, but I'd prioritize other issues over this. If that takes too long, though, you can take over any time you want.

kfuku52 commented 4 months ago

@Hego-CCTB Have you had a chance to look into this?

Hego-CCTB commented 2 months ago

I have created a conda recipe and tested the installation locally. Here are my steps:

This is the recipe, saved as "meta.yaml" in the working directory.

package:
  name: amalgkit
  version: "0.12.4"

source:
  git_url: https://github.com/kfuku52/amalgkit

build:
  number: 0
  script: "{{ PYTHON }} -m pip install ."

requirements:
  build:
    - python>=3.10
    - pip
  run:
    - python>=3.10
    - numpy>=2.0.0
    - biopython>=1.7
    - pandas>=2.2.2
    - lxml>=5.2.1
    - seqkit>=2.3.1
    - parallel-fastq-dump>=0.6.7
    - fastp>=0.22.0
    - kallisto>=0.48.0
    - r-ggplot2>=3.5.0
    - bioconductor-pcamethods>=1.90.0
    - bioconductor-ruvseq>=1.32.0
    - bioconductor-sva>=3.46.0
    - bioconductor-edger>=3.40.0
    - r-colorspace>=2.1_0
    - r-rcolorbrewer>=1.1_2
    - r-mass>=7.3_57
    - r-nmf>=0.21.0
    - r-dendextend>=1.16.0
    - r-amap>=0.8_19
    - r-pvclust>=2.2_0
    - r-rtsne>=0.16
    - r-patchwork>=1.2.0

about:
  home: https://github.com/kfuku52/amalgkit/
  license: BSD-3-Clause
  description: "AMALGKIT is a toolkit to integrate RNA-seq data from the NCBI SRA database and from private fastq files to generate unbiased cross-species transcript abundance dataset for a large-scale evolutionary gene expression analysis."

The package can then be built by running conda-build . in the working directory (conda-build may first need to be installed with conda install conda-build). The build creates an amalgkit.tar.bz2 file, which can be installed from the local channel via

conda install --use-local amalgkit 

I'm running a test set to check that everything works. If it does, all that's left is to create an Anaconda account and upload the package.
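Before running a full test set, it can help to confirm that the local install actually put the command-line dependencies on PATH. The following is a minimal sketch, not part of amalgkit: it assumes the executable names match the conda package names in the recipe (kallisto, fastp, seqkit, parallel-fastq-dump), that Rscript is provided through the R packages, and that the entry point is called amalgkit. The helper name find_missing_tools is hypothetical.

```python
import shutil

# Executables expected on PATH after installing the package.
# These names are assumptions based on the recipe; adjust as needed.
EXPECTED_TOOLS = [
    "amalgkit",
    "kallisto",
    "fastp",
    "seqkit",
    "parallel-fastq-dump",
    "Rscript",
]


def find_missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be located on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(EXPECTED_TOOLS)
    if missing:
        print("Missing executables:", ", ".join(missing))
    else:
        print("All expected executables were found on PATH.")
```

Running the script inside the environment where the package was installed gives a quick pass/fail answer. It only checks that the executables exist, not that their versions are compatible.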

Hego-CCTB commented 4 weeks ago

The kallisto version installed by this recipe (0.51) is causing issues during quant. I'll try to figure out the problem, but I may need to restrict the recipe to the latest compatible kallisto version. All other amalgkit modules work as expected with this recipe.

kfuku52 commented 4 weeks ago

Ideally, AMALGKIT should adjust its output parsing logic based on the installed version of Kallisto, but please assess the feasibility of implementing this functionality.
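One possible shape for version-dependent parsing is sketched below. It is only an illustration under stated assumptions, not amalgkit's actual code: it assumes that running kallisto version prints a line such as "kallisto, version 0.48.0", and the cutoff (0, 51, 0) is a placeholder, since the thread only establishes that 0.51 causes problems, not where the output format changed. The function names and the "legacy"/"new" labels are hypothetical.

```python
import re
import subprocess


def parse_kallisto_version(text):
    """Extract (major, minor, patch) from text like 'kallisto, version 0.48.0'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"No version number found in: {text!r}")
    # A missing patch component (e.g. '0.51') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def installed_kallisto_version():
    """Query the kallisto executable on PATH for its version (assumed CLI behavior)."""
    result = subprocess.run(
        ["kallisto", "version"], capture_output=True, text=True
    )
    # Search both streams, since the output stream is not guaranteed here.
    return parse_kallisto_version(result.stdout + result.stderr)


# Placeholder cutoff: the first version assumed to need the new parsing logic.
NEW_FORMAT_SINCE = (0, 51, 0)


def select_parser(version):
    """Return a label for the parsing logic to use for this kallisto version."""
    return "new" if version >= NEW_FORMAT_SINCE else "legacy"
```

With this layout, quant would call select_parser(installed_kallisto_version()) once and dispatch on the result. If adapting the parser turns out not to be feasible, an upper bound on kallisto in the recipe remains the fallback.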