robertaboukhalil / ginkgo

Cloud-based single-cell copy-number variation analysis tool
qb.cshl.edu/ginkgo
BSD 2-Clause "Simplified" License

Command line GinkGo #30

Closed pawelqs closed 4 years ago

pawelqs commented 4 years ago

Hi!

I would like to integrate GinkGo into my workflow. I see that I can install GinkGo on my computer, but can I run it from the command line? Or does it provide only the graphical interface? Or is there perhaps an API for GinkGo?

Regards, Paweł

pawelqs commented 4 years ago

OK, I have found the CLI script ;) Information about it could be added to the readme :)

pawelqs commented 4 years ago

I am still having problems running command-line GinkGo.

First, what I noticed:

  1. In ginkgo/cli/ginkgo.sh --input path/to/bed/files --genome hg19 --binning variable_500000_101_bowtie, the path/to/bed/files argument cannot be relative. Otherwise, the process.R script fails on setwd(user_dir) (line 67).

  2. The last line of ginkgo.sh must be changed to: tar -czf ${DIR_INPUT}/archive.tar.gz --exclude '*.bed' --exclude '*.bed.gz' --exclude '*.tar.gz' -C ${DIR_INPUT} ${DIR_INPUT}/*; otherwise the script fails.

The problem I have now is that in the output files, samples are identified by numbers rather than by filenames. How can I tell which number refers to which sample? I found that the temporary {filename}_mapped files created on line 137 of ginkgo.sh have headers like this: /home/pkus/ginkgo_test/temp. I cannot work out how to make it use the proper filenames.
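The relative-path problem in point 1 can be worked around on the caller's side by resolving the directory before passing it to --input. The following is only a minimal sketch: abs_dir is a hypothetical helper that is not part of ginkgo.sh, and the variable name DIR_INPUT is borrowed from the script purely for illustration.

```shell
# Hypothetical helper (not part of ginkgo.sh): print the absolute path of an
# existing directory, or return non-zero if it cannot be entered.
abs_dir() {
    # cd runs in a subshell, so the caller's working directory is untouched.
    # CDPATH is cleared so that cd does not print or search other locations.
    (CDPATH= cd -- "$1" 2>/dev/null && pwd)
}

# Example: normalise a relative input directory before using it.
DIR_INPUT="."
DIR_INPUT=$(abs_dir "${DIR_INPUT}") || { echo "input directory not found" >&2; exit 1; }
echo "${DIR_INPUT}"
```

The resolved value could then be passed as --input, so that a later setwd() on it does not depend on the current working directory.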

pawelqs commented 4 years ago

Got it!

ginkgo.sh, line 136:

    ${DIR_SCRIPTS}/binUnsorted ${DIR_GENOME}/${BINNING} ${NB_BINS} <(${Z}cat ${file}) `echo ${file} | awk -F ".bed" '{print $1}'` ${file}_mapped

awk gets a string of concatenated filenames and cuts it on the '.bed' extension. Unfortunately, I used the 'bed' keyword in my path and the names became broken.
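The effect is easy to reproduce in isolation. In the sketch below the path is hypothetical (it is not the path from this report); only the awk invocation is taken from the line quoted above. Because a multi-character field separator is treated by awk as a regular expression, the "." in ".bed" matches any character, so the first split happens wherever any character is followed by "bed".

```shell
# Hypothetical path: "bed" appears in a directory name as well as in the extension.
file="/data/my_bed_files/cellA.bed"

# Same awk call as in the quoted line. The separator ".bed" is a regex, so the
# first field ends at "_bed" inside the directory name, not at the extension.
name=$(echo "${file}" | awk -F ".bed" '{print $1}')
echo "${name}"   # prints /data/my
```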

I am not a bash expert, so I will not write the solution, but reading the 'list' file line by line, treating line endings as separators, and then removing the .bed / .bed.gz extensions would give the desired result without any error ;)
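A minimal sketch of that idea, under stated assumptions: sample_name is a hypothetical helper, the example paths are invented, and the list file is created on the fly so the snippet is self-contained. It also assumes that the bare file name without its directory is the desired sample label, which is a choice of this sketch and not something ginkgo.sh is known to do.

```shell
# Hypothetical helper: derive a sample name from a file path by dropping the
# directory, then a trailing ".gz" if present, then a trailing ".bed" if present.
sample_name() {
    local name
    name=${1##*/}      # strip everything up to the last "/"
    name=${name%.gz}   # strip a trailing .gz
    name=${name%.bed}  # strip a trailing .bed
    printf '%s\n' "${name}"
}

# Build a throwaway list file with one path per line (invented example paths).
list=$(mktemp)
printf '%s\n' "/data/my_bed_files/cellA.bed" "/data/my_bed_files/cellB.bed.gz" > "${list}"

# Read the list line by line, so only line endings act as separators.
while IFS= read -r file; do
    [ -n "${file}" ] || continue   # skip empty lines
    sample_name "${file}"
done < "${list}"

rm -f "${list}"
```

Since only suffixes are removed, a "bed" elsewhere in the path or file name is left alone.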

Regards, Paweł

pawelqs commented 4 years ago

I also found that GinkGo fails if the diploid reference sample is given as a compressed file.

Best, Paweł