Closed olgabot closed 5 years ago
I think I got this working, but I had to do some weird filesystem stuff to make the eval
actually evaluate.
func Index(fasta_gz file, id string) = {
// Create a bwt index of the fasta
d := dirs.Make([id + ".fasta.gz": fasta_gz])
// BWA accepts gzipped files as input
outdir := exec(image := bwa, cpu := 1) (outdir dir) {"
cd {{d}}
bwa index {{id}}.fasta.gz
ls -lha {{d}}
mv {{d}} {{outdir}}
"}
// NOTE: outdir is not being used anywhere, so it won't get evaluated
outdir := trace(outdir)
d := trace(d)
// val (index, _) = dirs.Pick(d, "*.bwt")
// to force outdir evaluation, remove above line and uncomment below.
// val (index, _) = outdir ~> dirs.Pick(d, "*.bwt")
outdir
}
But ... outdir still only had the fasta in it, not any of the index files:
../../reflow/bwa.rf:24:20(bwa.Index.outdir): dir("0/dovetail_2018-05-06.fasta": file(sha256=sha256:3a72c5bb77cb0a133ce77b8055d79878c040bf34de2c7b4f513592d8070ac0ac, size=464355639))
Full output:
Plus this doesn't seem like best practices to me ... is there a better way?
I also don't see any of the stdout/stderr normally produced by the bwa index
command, even in the execlog. Is there a way to find it?
You probably want to run the command in outdir, e.g., from the 1000align example:
// g1kv37Indexed is the BWA-indexed version of g1kv37.
val g1kv37Indexed = exec(image := "biocontainers/bwa", mem := GiB, cpu := 1) (out dir) {"
# Ignore failures here. The file from 1000genomes has a trailer
# that isn't recognized by gunzip. (This is not recommended practice!)
gunzip -c {{g1kv37}} > {{out}}/g1k_v37.fa || true
cd {{out}}
bwa index -a bwtsw g1k_v37.fa
"}
I also don't see any of the stdout/stderr normally produced by the bwa index command, even in the execlog. Is there a way to find it?
This is persisted but isn't currently easily accessible via tooling. @prasadgopal is very close to landing a change that makes reflow logs (as well as reflow info) work for non-running tasks as well.
If you have the flow id (the exec sha256), then you can look this up in the dynamodb table that stores the assoc (you can get that from your config, e.g.: assoc: dynamodb,reflow-cache-test).
The entry for the sha256 will contain a "Logs" section. This is the sha256 of the stdout and stderr of the process.
Once you have this, you can examine the logs with reflow cat <sha256 of log>.
Again, the tooling around this will massively improve once @prasadgopal lands his change. The goal is to have reflow info, reflow logs, reflow cat, etc. just work for all object types. So if you see an identifier anywhere in reflow, you should be able to query it with reflow info, etc.
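Going back to the advice above about running the command in the output directory: applied to the Index function from the question, that pattern might look roughly like the sketch below. This is only an illustration, not a tested workflow. The image name and the mem value are borrowed from the 1000align example and are assumptions here (a large genome may well need more memory), and the cp step is one possible way to stage the input under a stable name.

```
// Sketch only: Index builds a BWA index for a gzipped FASTA.
// The input is staged inside the exec's output directory, so the
// index files that bwa writes next to its input land in the output.
func Index(fasta_gz file, id string) =
	exec(image := "biocontainers/bwa", mem := GiB, cpu := 1) (out dir) {"
		# Copy the input into the output directory under a stable name.
		cp {{fasta_gz}} {{out}}/{{id}}.fasta.gz
		cd {{out}}
		# bwa index creates its index files alongside the input.
		bwa index {{id}}.fasta.gz
	"}
```

With this shape the exec's result is the value the function returns, so there should be no need for trace or ~> to force evaluation. A call such as the dirs.Pick(..., "*.bwt") from the original attempt could then be pointed at the returned directory instead of at d.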
Like many bioinformatics packages, bwa index relies on creating files based on the input filenames. Here's the stdout for running bwa index locally:
After running bwa index locally, there's quite a few files that get output:
Here's what I'm trying to do to get bwa index to work:
But when I inspect either d or outdir, all they have is the fasta!! Where did the output for bwa index go??