kbaseattic / assembly

An extensible framework for genome assembly.
MIT License

Store arbitrary plugin output, plugin manager bugfix #295

Closed (cbun closed 9 years ago)

cbun commented 9 years ago

Keeps track of any output provided by a plugin. This can be used with tools such as FastQC to retrieve analysis data:

$ curl localhost:8000/user/cbun/job/1007/data | python -mjson.tool
...
{
                "module": "fastqc",
                "module_output": {
                    "file_stats": {
                        "0": {
                            "Basic Statistics": "pass",
                            "Kmer Content": "warn",
                            "Overrepresented sequences": "warn",
                            "Per base GC content": "fail",
                            "Per base N content": "pass",
                            "Per base sequence content": "warn",
                            "Per base sequence quality": "fail",
                            "Per sequence GC content": "pass",
                            "Per sequence quality scores": "pass",
                            "Sequence Duplication Levels": "pass",
                            "Sequence Length Distribution": "pass"
                        },
                        "1": {
                            "Basic Statistics": "pass",
                            "Kmer Content": "warn",
                            "Overrepresented sequences": "warn",
                            "Per base GC content": "fail",
                            "Per base N content": "pass",
                            "Per base sequence content": "warn",
                            "Per base sequence quality": "pass",
                            "Per sequence GC content": "pass",
                            "Per sequence quality scores": "pass",
                            "Sequence Duplication Levels": "pass",
                            "Sequence Length Distribution": "pass"
                        }
                    },
                    "input_data": [
                        "/mnt/data/cbun/235/raw/p1.fq",
                        "/mnt/data/cbun/235/raw/p2.fq"
                    ],
                    "report": [
                        "/mnt/data/cbun/235/1007/fastqc_56faf72f-4092-4e5b-b680-f51c1e93d97b/p2.fq_fastqc/p2.fq_fastqc.txt",
                        "/mnt/data/cbun/235/1007/fastqc_56faf72f-4092-4e5b-b680-f51c1e93d97b/p1.fq_fastqc/p1.fq_fastqc.txt"
                    ]
                }
...
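
A minimal sketch of how a client might summarize the stored FastQC output. It works on one module entry shaped like the JSON above (abridged here). The surrounding response envelope is elided in the example, so it is not modeled, and the helper name `checks_with_status` is hypothetical rather than part of the service.

```python
import json

# Abridged copy of one module entry from the response shown above.
SAMPLE_ENTRY = json.loads("""
{
  "module": "fastqc",
  "module_output": {
    "file_stats": {
      "0": {"Basic Statistics": "pass",
            "Kmer Content": "warn",
            "Per base GC content": "fail",
            "Per base sequence quality": "fail"},
      "1": {"Basic Statistics": "pass",
            "Kmer Content": "warn",
            "Per base GC content": "fail",
            "Per base sequence quality": "pass"}
    }
  }
}
""")


def checks_with_status(entry, status):
    """Map each file index to the sorted FastQC checks that have `status`.

    Hypothetical helper: assumes `entry` has the module_output/file_stats
    layout shown in the example response.
    """
    file_stats = entry.get("module_output", {}).get("file_stats", {})
    return {
        index: sorted(name for name, result in checks.items() if result == status)
        for index, checks in file_stats.items()
    }


if __name__ == "__main__":
    print(checks_with_status(SAMPLE_ENTRY, "fail"))
```

With the full response, the same function could be applied to each entry whose "module" is "fastqc".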
levinas commented 9 years ago

@cbun Chris, how do we access the fastqc text files?

cbun commented 9 years ago

@levinas this route should show any available files:

$ curl localhost:8000/user/cbun/job/1007/shock_node

{
    "1.fastqc_report_1.txt": "6501bb03-104f-4cc1-915f-db2b60eec28b",
    "1.fastqc_report_2.txt": "ed59a395-7b1e-410f-b25e-4ab39112a5d7",
    "1007_report.txt": "58c69501-7de1-417b-9317-9b4c2390bea0"
}
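
As a rough sketch, the mapping returned by this route could be read from Python as follows. The URL is the one from the curl example; the helper names are hypothetical, and the "fastqc_report" filename filter is an assumption based only on the sample response above.

```python
import json
from urllib.request import urlopen

# Route from the curl example; host, user, and job id are specific to it.
SHOCK_NODE_URL = "http://localhost:8000/user/cbun/job/1007/shock_node"

# Copy of the sample response shown above.
SAMPLE_NODES = {
    "1.fastqc_report_1.txt": "6501bb03-104f-4cc1-915f-db2b60eec28b",
    "1.fastqc_report_2.txt": "ed59a395-7b1e-410f-b25e-4ab39112a5d7",
    "1007_report.txt": "58c69501-7de1-417b-9317-9b4c2390bea0",
}


def fetch_shock_nodes(url=SHOCK_NODE_URL):
    """Fetch the filename -> node id mapping (requires a running server)."""
    with urlopen(url) as response:
        return json.load(response)


def fastqc_reports(nodes):
    """Keep only entries whose filename contains 'fastqc_report'.

    The naming pattern is an assumption taken from the sample response.
    """
    return {name: node_id for name, node_id in nodes.items() if "fastqc_report" in name}


if __name__ == "__main__":
    # Uses the sample data so the sketch runs without a server.
    for name, node_id in sorted(fastqc_reports(SAMPLE_NODES).items()):
        print(name, node_id)
```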

Note that whether the FastQC files are available also depends on the recipe.

Wasp Upload

Wasp uploads the default output of the outermost expression. As an example of a default, kiki.py specifies OUTPUT = contigs and returns the dictionary {"contigs": ["contigs.fa"]}, so the files listed under the contigs key are its default output.

(fastqc READS)
;; => fastqc.txt

It also uploads the default output of every child of a (begin ...) expression:

(begin 
  (fastqc READS) 
  (kiki READS))
;; => fastqc.txt, kiki.fa

Finally, an inner expression can be explicitly marked for upload by wrapping it in (upload ...):

(sspace (upload (kiki READS)))
;; => sspace.fa, kiki.fa