Illumina / canvas

Canvas - Copy number variant (CNV) calling from DNA sequencing data

Inconsistent library release #110

Open bioinfornatics opened 5 years ago

bioinfornatics commented 5 years ago

Dear,

Currently, each Canvas release ships multiple copies of the same library, such as Isas.Framework.dll, and of the same executable, such as tabix.

This is error-prone and does not follow standard packaging conventions; see the FHS (Filesystem Hierarchy Standard).

Why is it error-prone? In your release archive, different versions of the same library are in use:

$ find Canvas-1.39.0.1598+master_x64/ -name 'Isas.Framework.dll' | xargs sha1sum
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/CanvasBin/Isas.Framework.dll
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/CanvasClean/Isas.Framework.dll
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/CanvasDiploidCaller/Isas.Framework.dll
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/CanvasNormalize/Isas.Framework.dll
15bbd898f28b02ac72de2a53a50a2a9ad78e61d9  Canvas-1.39.0.1598+master_x64/CanvasPartition/Isas.Framework.dll
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/CanvasPedigreeCaller/Isas.Framework.dll
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/CanvasSNV/Isas.Framework.dll
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/CanvasSmooth/Isas.Framework.dll
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/CanvasSomaticCaller/Isas.Framework.dll
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/Isas.Framework.dll
15bbd898f28b02ac72de2a53a50a2a9ad78e61d9  Canvas-1.39.0.1598+master_x64/Tools/EvaluateCNV/Isas.Framework.dll
2822eddeaa941a6a6999b562517c3a35dfa1807c  Canvas-1.39.0.1598+master_x64/Tools/FlagUniqueKmers/Isas.Framework.dll

The same goes for tabix:

$ find Canvas-1.39.0.1598+master_x64/ -name 'tabix' | xargs sha1sum
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/CanvasBin/tabix
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/CanvasClean/tabix
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/CanvasDiploidCaller/tabix
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/CanvasNormalize/tabix
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/CanvasPartition/tabix
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/CanvasPedigreeCaller/tabix
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/CanvasSNV/tabix
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/CanvasSmooth/tabix
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/CanvasSomaticCaller/tabix
b22ffa6d00275110f0e4fa02e08eee8866547874  Canvas-1.39.0.1598+master_x64/Tools/EvaluateCNV/tabix
3d40aa36f63eb34802d5dcd4fd985d47cd2b0ed0  Canvas-1.39.0.1598+master_x64/tabix

And so on ...
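
The two checks above can be generalised to every file name in the archive. The following is a minimal sketch, assuming GNU findutils and coreutils (`find`, `sha1sum`, `sort`, `uniq`) and awk are available; the function name `list_inconsistent_files` is hypothetical and not part of Canvas.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every file name that occurs in the tree with
# more than one distinct SHA-1 checksum, followed by the number of variants.
list_inconsistent_files() {
    local root="$1"
    find "${root}" -type f -exec sha1sum {} + |
        # sha1sum prints "<hash>  <path>"; keep only "<basename><TAB><hash>".
        awk '{ hash = $1; sub(/^[^ ]+  /, ""); n = split($0, parts, "/"); print parts[n] "\t" hash }' |
        # One line per distinct (name, hash) pair; the C locale keeps equal names adjacent.
        LC_ALL=C sort -u |
        cut -f1 |
        uniq -c |
        # Report only the names that have more than one distinct hash.
        awk '$1 > 1 { count = $1; sub(/^ *[0-9]+ /, ""); print $0 "\t" count }'
}
```

Called as `list_inconsistent_files Canvas-1.39.0.1598+master_x64/`, it should report at least Isas.Framework.dll and tabix, since the listings above show two distinct checksums for each.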

You should provide a standards-compliant archive like this:

.
├── bin
│   ├── canvas
│   ├── EvaluateCNV
│   └── FlagUniqueKmers
├── installation.log.xz
├── lib
│   ├── Accord.dll
│   ├── Accord.dll.config
│   ├── Accord.MachineLearning.dll
│   ├── Accord.Math.Core.dll
│   ├── Accord.Math.dll
│   ├── Accord.Statistics.dll
│   ├── BamMetricsWrapper.dll
│   ├── CanvasBin.deps.json
│   ├── CanvasBin.dll
│   ├── CanvasBin.dll.config
│   ├── CanvasBin.pdb
│   ├── CanvasBin.runtimeconfig.dev.json
│   ├── CanvasBin.runtimeconfig.json
│   ├── CanvasClean.deps.json
│   ├── CanvasClean.dll
│   ├── CanvasClean.dll.config
│   ├── CanvasClean.pdb
│   ├── CanvasClean.runtimeconfig.dev.json
│   ├── CanvasClean.runtimeconfig.json
│   ├── CanvasCommon.dll
│   ├── CanvasCommon.dll.config
│   ├── CanvasCommon.pdb
│   ├── Canvas.deps.json
│   ├── CanvasDiploidCaller.deps.json
│   ├── CanvasDiploidCaller.dll
│   ├── CanvasDiploidCaller.dll.config
│   ├── CanvasDiploidCaller.pdb
│   ├── CanvasDiploidCaller.runtimeconfig.dev.json
│   ├── CanvasDiploidCaller.runtimeconfig.json
│   ├── Canvas.dll
│   ├── Canvas.dll.config
│   ├── CanvasNormalize.deps.json
│   ├── CanvasNormalize.dll
│   ├── CanvasNormalize.dll.config
│   ├── CanvasNormalize.pdb
│   ├── CanvasNormalize.runtimeconfig.dev.json
│   ├── CanvasNormalize.runtimeconfig.json
│   ├── CanvasPartition.deps.json
│   ├── CanvasPartition.dll
│   ├── CanvasPartition.dll.config
│   ├── CanvasPartitionParameters.json
│   ├── CanvasPartition.pdb
│   ├── CanvasPartition.runtimeconfig.dev.json
│   ├── CanvasPartition.runtimeconfig.json
│   ├── Canvas.pdb
│   ├── CanvasPedigreeCaller.deps.json
│   ├── CanvasPedigreeCaller.dll
│   ├── CanvasPedigreeCaller.dll.config
│   ├── CanvasPedigreeCaller.pdb
│   ├── CanvasPedigreeCaller.runtimeconfig.dev.json
│   ├── CanvasPedigreeCaller.runtimeconfig.json
│   ├── Canvas.runtimeconfig.json
│   ├── CanvasSmooth.deps.json
│   ├── CanvasSmooth.dll
│   ├── CanvasSmooth.dll.config
│   ├── CanvasSmooth.pdb
│   ├── CanvasSmooth.runtimeconfig.dev.json
│   ├── CanvasSmooth.runtimeconfig.json
│   ├── CanvasSNV.deps.json
│   ├── CanvasSNV.dll
│   ├── CanvasSNV.dll.config
│   ├── CanvasSNV.pdb
│   ├── CanvasSNV.runtimeconfig.dev.json
│   ├── CanvasSNV.runtimeconfig.json
│   ├── CanvasSomaticCaller.deps.json
│   ├── CanvasSomaticCaller.dll
│   ├── CanvasSomaticCaller.dll.config
│   ├── CanvasSomaticCaller.pdb
│   ├── CanvasSomaticCaller.runtimeconfig.dev.json
│   ├── CanvasSomaticCaller.runtimeconfig.json
│   ├── ClassicBioinfoTools.dll
│   ├── Combinatorics.dll
│   ├── Combinatorics.pdb
│   ├── EvaluateCNV.deps.json
│   ├── EvaluateCNV.dll
│   ├── EvaluateCNV.dll.config
│   ├── EvaluateCNV.pdb
│   ├── EvaluateCNV.runtimeconfig.json
│   ├── FileCompression.dll
│   ├── FlagUniqueKmers.deps.json
│   ├── FlagUniqueKmers.dll
│   ├── FlagUniqueKmers.pdb
│   ├── FlagUniqueKmers.runtimeconfig.json
│   ├── Illumina.Common.dll
│   ├── Isas.Framework.dll
│   ├── Isas.Manifests.AmpliconManifest.dll
│   ├── Isas.Manifests.ForenSeqManifest.dll
│   ├── Isas.Manifests.NexteraManifest.dll
│   ├── Isas.Metrics.dll
│   ├── Isas.Ploidy.dll
│   ├── Isas.SequencingFiles.dll
│   ├── libFileCompression.so
│   ├── MathNet.Numerics.Core.dll
│   ├── Newtonsoft.Json.dll
│   ├── Nito.ArraySegments.dll
│   ├── PedigreeCallerParameters.json
│   ├── protobuf-net.dll
│   ├── QualityScoreParameters.json
│   ├── RnaAmpliconManifest.dll
│   └── SomaticCallerParameters.json
└── share
    └── doc
        └── COPYRIGHT.txt

To do this, you could use the following script in the future (once every copy is the same version, see above):

DESTDIR='' # staging root used at build time (see how an RPM or DEB package is built)
PREFIX=''  # final installation path, e.g. ~/.local, /usr/local, /usr, and so on
INSTDIR="${DESTDIR}/${PREFIX}"
mkdir -p "${INSTDIR}"/{bin,lib,share}
find . -type f  \( -name 'tabix' -or -name 'tabix.exe' -or -name 'FlagUniqueKmers' -or -name 'EvaluateCNV' -or -name 'Canvas' -or -name 'bedGraphToBigWig'  \) -delete
find . \( -name '*.dll' -or -name '*.json' -or -name '*.pdb' -or -name '*.config' -or -name '*.so' \) | xargs -I{} install -m0644 {} "${INSTDIR}"/lib

echo '#!/usr/bin/env bash
declare -r scriptpath="$( readlink -f "${BASH_SOURCE[0]}" )"
declare -r bindir="$(dirname  "${scriptpath}")"
declare -r libdir="$(dirname "${bindir}")/lib"
export COMPlus_gcAllowVeryLargeObjects=1
dotnet "${libdir}/Canvas.dll" "$@"
' > "${INSTDIR}"/bin/canvas

echo '#!/usr/bin/env bash
declare -r scriptpath="$( readlink -f "${BASH_SOURCE[0]}" )"
declare -r bindir="$(dirname  "${scriptpath}")"
declare -r libdir="$(dirname "${bindir}")/lib"
dotnet "${libdir}/FlagUniqueKmers.dll" "$@"
' > "${INSTDIR}"/bin/FlagUniqueKmers

echo '#!/usr/bin/env bash
declare -r scriptpath="$( readlink -f "${BASH_SOURCE[0]}" )"
declare -r bindir="$(dirname  "${scriptpath}")"
declare -r libdir="$(dirname "${bindir}")/lib"
export COMPlus_gcAllowVeryLargeObjects=1
dotnet "${libdir}/EvaluateCNV.dll" "$@"
' > "${INSTDIR}"/bin/EvaluateCNV

chmod 0755 "${INSTDIR}"/bin/{canvas,FlagUniqueKmers,EvaluateCNV}
find "${INSTDIR}" -name '*.so' | xargs chmod 0755
install -D -m0644 COPYRIGHT.txt "${INSTDIR}"/share/doc/COPYRIGHT.txt
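
One detail of the wrappers is worth illustrating: how the lib directory is derived from the bin directory. The sketch below compares bash pattern substitution, which replaces the first occurrence of "bin" anywhere in the path, with taking the parent directory via `dirname`. The function name `sibling_libdir` and the example path are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return the lib directory that sits next to a bin directory.
sibling_libdir() {
    local bindir="$1"
    printf '%s\n' "$(dirname "${bindir}")/lib"
}

bindir='/home/robin/.local/bin'
# Pattern substitution rewrites the first "bin", which here is inside "robin".
echo "${bindir/bin/lib}"      # prints /home/rolib/.local/bin
# Going through the parent directory only touches the last path component.
sibling_libdir "${bindir}"    # prints /home/robin/.local/lib
```

Going through the parent directory keeps the wrappers working for any install prefix, including one that happens to contain "bin" in an earlier path component.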

Thanks

eroller commented 5 years ago

This software runs on the .NET Core runtime, so I'm not sure that a standard meant for native Linux distributions makes much sense here. The .NET Core runtime has its own rules for resolving library dependencies, and they do not follow the Linux conventions for .so files.

With our current release, each separate executable is free to depend on its own version of any library (or binary such as tabix). This does create file duplication, but it also gives us the freedom not to upgrade dependencies for every subcomponent unnecessarily, which reduces the test burden.

We could have a separate installation script for each .NET main entry point that creates the structure you describe, but the primary Canvas entry point would still need to know where each executable lives. Using the Linux PATH to find each entry point is one option, but rather than requiring additional configuration via PATH modification, the paths are currently defined implicitly by the filesystem hierarchy.
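
The two lookup strategies mentioned here can be sketched as a shell function. This is an illustration only, not how Canvas actually resolves its components: the function name `resolve_component` is hypothetical, and the `<base_dir>/<name>/<name>.dll` layout is an assumption inferred from the directory names in the release archive.

```shell
#!/usr/bin/env bash
# Hypothetical illustration: locate a component either relative to the
# directory of the main entry point (implicit layout) or through PATH.
resolve_component() {
    local base_dir="$1" name="$2"
    # Implicit layout (assumed): <base_dir>/<name>/<name>.dll
    local candidate="${base_dir}/${name}/${name}.dll"
    if [[ -f "${candidate}" ]]; then
        printf '%s\n' "${candidate}"
        return 0
    fi
    # Fallback: a command of that name on PATH; returns non-zero if none exists.
    command -v "${name}"
}
```

The implicit layout needs no configuration from the user, while the PATH fallback is what a bin/lib split with wrapper scripts would rely on.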