guix-science / guix-science-nonfree

Non-free scientific packages for GNU Guix.

Help packaging GATK-4.4.0.0 #5

Open stkgo opened 1 year ago

stkgo commented 1 year ago

Could someone help me package GATK 4.4.0.0? Here is what my package definition looks like thus far:

(define-public gatk
  (package
    (name "gatk")
    (version "4.4.0.0")
    (source (origin
              (method url-fetch)
              (uri (string-append
                    "https://github.com/broadinstitute/gatk/releases/download/"
                    version "/gatk-" version ".zip"))
              (sha256
               (base32
                "1453l72cfwqw5y3na0d520czzq11x46d9dq66q5ssilbngvh0ij4"))))
    (build-system gnu-build-system)
    (arguments
     `(#:tests? #f ; This is a binary package only, so no tests.
       #:phases
       (modify-phases %standard-phases
         (delete 'configure) ; Nothing to configure.
         (delete 'build) ; This is a binary package only.
         (replace 'install
           (lambda _
             (let ((out (string-append (assoc-ref %outputs "out")
                                       "/share/java/" ,name "/")))
               (mkdir-p out)
               (copy-recursively (assoc-ref %build-inputs "source")
                                 (string-append out "/"))))))))
    (native-inputs
     (list unzip))
    (propagated-inputs
     (list r-gsalib
           r-ggplot2
           r-gplots
           r-reshape
           r-optparse
           r-dnacopy
           r-naturalsort
           r-dplyr
           r-data-table
           r-hmm))
    (home-page "https://www.broadinstitute.org/gatk/")
    (synopsis "Package for analysis of high-throughput sequencing")
    (description "The Genome Analysis Toolkit or GATK is a software package for
analysis of high-throughput sequencing data, developed by the Data Science and
Data Engineering group at the Broad Institute.  The toolkit offers a wide
variety of tools, with a primary focus on variant discovery and genotyping as
well as strong emphasis on data quality assurance.  Its robust architecture,
powerful processing engine and high-performance computing features make it
capable of taking on projects of any size.")
    ;; There are additional restrictions that make it nonfree.
    (license license:expat)))

I have modeled this after the GATK 3.8 package that previously existed in guix-science-nonfree. I think all I need to do in the install step is copy the entire unarchived directory to the output directory, but I am receiving the following error:

error: in phase 'install': uncaught exception:
system-error "copy-file" "~A" ("Is a directory") (21) 
phase `install' failed after 0.0 seconds

I had thought copy-recursively should work here, but perhaps I am mistaken.

Also, I would be happy to submit a merge request for this package once it is working, if that would be desired.

rekado commented 1 year ago

GATK 4 is free software AFAIK so it shouldn't be added to the -nonfree channel.

I suggest using G-expressions. You also don't need to append a slash to the output.
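
As a rough sketch of that suggestion (untested, not a definitive definition), the arguments field could be rewritten with G-expressions along the following lines. It rests on two assumptions not confirmed in this thread: that the "Is a directory" error arises because the "source" input is the zip file itself while the copy target is an already-created directory, and that the zip unpacks into a single top-level directory which the standard 'unpack' phase then enters. The name `target` is only illustrative.

```scheme
;; Sketch only: a replacement for the `arguments' field of the package
;; above.  The module that defines the package would need (guix gexp)
;; for the #~ and #$ syntax.
(arguments
 (list
  #:tests? #f                      ; binary release, so no tests
  #:phases
  #~(modify-phases %standard-phases
      (delete 'configure)          ; nothing to configure
      (delete 'build)              ; nothing to build
      (replace 'install
        (lambda _
          ;; Assumption: 'unpack' has already extracted the archive and
          ;; changed into the extracted directory, so copy "." instead
          ;; of the "source" input (the zip file itself).
          (let ((target (string-append #$output "/share/java/" #$name)))
            (mkdir-p target)
            (copy-recursively "." target)))))))
```

Here `#$output` stands in for `(assoc-ref %outputs "out")`, and no trailing slash is appended to the target path.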

stkgo commented 1 year ago

@rekado I had received the same error without appending the slash, so I thought Guix might be doing something odd with path handling. I had thought GATK was free software, but I wasn't positive since I had seen that previous versions were nonfree. I can re-open this issue in the guix-science repo if you would prefer.