nf-core / kmermaid

k-mer similarity analysis pipeline
https://nf-co.re/kmermaid
MIT License

added extract coding, tested on travis, added docker.tmp="auto" #40

Closed pranathivemuri closed 4 years ago

pranathivemuri commented 4 years ago

PR checklist

Learn more about contributing: https://github.com/nf-core/kmer-similarity/tree/master/.github/CONTRIBUTING.md

pranathivemuri commented 4 years ago

@olgabot I replied to your comments; please review again when you are free. I want to merge into dev sooner rather than later. Waiting longer means more conflicts to resolve and a risk of missing the next release.

pranathivemuri commented 4 years ago

There is an extract_coding test failing online; the same test passes locally:

(nextflow) ➜ kmermaid git:(olgabot/khtools-extract-coding) nextflow run main.nf -profile test_extract_coding
N E X T F L O W ~ version 19.07.0
Launching main.nf [wise_sax] - revision: 97ac92c15c
WARN: There's no process matching config selector: sourmash_compute_sketch_fast
----------------------------------------------------
nf-core/kmermaid v1.0.0dev
----------------------------------------------------
Run Name : wise_sax
BAM : https://github.com/nf-core/test-datasets/raw/kmermaid/testdata/10x-example/possorted_genome_bam.bam
K-mer sizes : 3,9
Molecule : dna,protein,dayhoff
Log2 Sketch Sizes : 2,4
One Sig per Record: true
Track Abundance : false
Bam chunk line count: 350
Count valid reads : 10
Saved Fastas : fastas
Barcode umi read metadata: metadata.csv
Peptide fasta : https://github.com/czbiohub/test-datasets/raw/kmermaid/reference/gencode.v32.pc_translations.subsample5.randomseed0.fa
Peptide ksize : 11
Peptide molecule : dayhoff
Bloom filter table size: 1e8
Max Resources : 6 GB memory, 2 cpus, 2d time per job
Output dir : ./results
Launch dir : /Users/pranathivemuri/czbiohub/kmermaid
Working dir : /Users/pranathivemuri/czbiohub/kmermaid/work
Script dir : /Users/pranathivemuri/czbiohub/kmermaid
User : pranathivemuri
Config Profile : test_extract_coding
Config Description: Minimal test dataset to check pipeline function
----------------------------------------------------
executor > local (4)
[24/647993] process > get_software_versions [100%] 1 of 1 ✔
[aa/931af6] process > peptide_bloom_filter (genco... [100%] 1 of 1 ✔
[90/081d52] process > bam2fasta (bam2fasta) [100%] 1 of 1 ✔
[37/80887f] process > extract_coding (possorted_g... [100%] 1 of 1 ✔
[- ] process > sourmash_compute_sketch_fas... -
[- ] process > sourmash_compute_sketch_fas... -
[- ] process > sourmash_compare_sketches -
Warning, pipeline completed, but with errored process(es)
Number of ignored errored process(es) : 0
Number of successfully ran process(es) : 4
[nf-core/kmermaid] Pipeline completed successfully
WARN: Task runtime metrics are not reported when using macOS without a container engine

The FASTA file output from the extract_coding step is empty, though.
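One quick way to confirm an empty result is to count the FASTA header lines in the output. This is only a minimal sketch: the helper names `count_fasta_records` and `is_empty_fasta` are hypothetical and not part of the pipeline, and it assumes an uncompressed, plain-text FASTA file.

```python
def count_fasta_records(path):
    """Count records in a plain-text FASTA file.

    Each record starts with a header line beginning with '>'.
    Hypothetical helper for illustration; not part of kmermaid.
    """
    count = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


def is_empty_fasta(path):
    """Return True when the file contains no FASTA records at all."""
    return count_fasta_records(path) == 0
```

Calling `is_empty_fasta` on the file written by extract_coding would return True in the situation described here, which could serve as a simple assertion in a test once a matching BAM and peptide set is available.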

pranathivemuri commented 4 years ago

I am testing extract_coding with the FASTA input instead. The 10x example BAM and the 5 randomly subsampled translations I added produce an empty FASTA. Let me know if this should have worked as well.

pranathivemuri commented 4 years ago

@olgabot the FASTA file and extract_coding worked. If we want to test a BAM file with extract_coding, we need a smaller subset of the BAM file and a peptide translations FASTA for the sequences in that BAM file.

pranathivemuri commented 4 years ago

@olgabot please review when you are free

pranathivemuri commented 4 years ago

@olgabot please review when you are free! I added the test for extract_coding on a BAM file.

pranathivemuri commented 4 years ago

@olgabot please review when you are free!