wbazant / CORRAL

MIT License

bsub and input format for local fastq files #3

Closed ChanyeongKim closed 2 years ago

ChanyeongKim commented 2 years ago

Hi, thanks for providing such a nice tool.

I installed all the requirements (bowtie, marker_alignments, samtools) and the executables are all in $PATH. I also linked the Eukdetect database properly.

Then I ran Nextflow but hit the error below:

  Cannot run program "bsub" (in directory "/path/to/corral/work/e7/f694a00a749673c4554ac632bfee8f"): error=2, No such file or directory

It seems that the pipeline tried to submit a job to the LSF queuing system (right?), but my system does not use LSF. Can I still use the pipeline without it?

Also, I want to run it on fastq files that I have already downloaded (that is, without using wget). It seems that --downloadMethod "local" supports this, is that right? And what should the input.tsv format be in this case?

Thanks, Chanyeong Kim.

wbazant commented 2 years ago

Hi, thanks for raising the issue and following my instructions in the README! I will update them in a bit to make them a bit better.

If you want to run the pipeline locally, skip -c $DIR/cluster.conf. You could replace it with -c $DIR/local.conf and use the following local.conf:

process {
  maxForks = 3

  withLabel: 'align' {
    errorStrategy = 'finish'
  }
}

The syntax for in.tsv should be two columns (sampleId + readsPath) for single-end reads, or three columns (sampleId + readsPathForward + readsPathReverse) for paired-end reads.
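The column layout above can be checked before launching the pipeline. The following is only a minimal Python sketch, not part of CORRAL: the function name `parse_input_tsv` is hypothetical, and it assumes the file is tab-separated with no header line. Whether the pipeline accepts single-end and paired-end rows in the same file is not stated here, so the sketch simply reports what it finds.

```python
import csv
import io


def parse_input_tsv(text):
    """Parse tab-separated rows of sampleId plus one or two read paths.

    Hypothetical helper: assumes tab-separated columns and no header line.
    Returns a list of (sample_id, [read_paths]) tuples.
    """
    samples = []
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for line_no, row in enumerate(reader, start=1):
        # Skip blank lines rather than treating them as errors.
        if not row or all(not cell.strip() for cell in row):
            continue
        if len(row) not in (2, 3):
            raise ValueError(
                f"line {line_no}: expected 2 or 3 columns, got {len(row)}"
            )
        sample_id, *paths = (cell.strip() for cell in row)
        if not sample_id or any(not path for path in paths):
            raise ValueError(f"line {line_no}: empty sampleId or reads path")
        samples.append((sample_id, paths))
    return samples


if __name__ == "__main__":
    example = "s1\t/data/s1_1.fastq\t/data/s1_2.fastq\n"
    for sample_id, paths in parse_input_tsv(example):
        layout = "paired end" if len(paths) == 2 else "single end"
        print(sample_id, layout, paths)
```

A row with one read path is treated as single end and a row with two as paired end; anything else raises a ValueError naming the offending line.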

wbazant commented 2 years ago

@ChanyeongKim how did it go? Were you able to run something locally?

ChanyeongKim commented 2 years ago

Sorry for the late response. Yes, that part works now. Thanks for your help!

But I encountered another problem at the summarizeAlignments step. Below is the error message:

executor >  local (3)
[46/35fbd4] process > bowtie2Paired (1)       [100%] 1 of 1 ✔
[3d/6e5916] process > alignmentStats (1)      [100%] 1 of 1 ✔
[ca/180880] process > summarizeAlignments (1) [100%] 1 of 1, failed: 1 ✘
[-        ] process > makeTsv                 -
Error executing process > 'summarizeAlignments (1)'

Caused by:
  Process `summarizeAlignments (1)` terminated with an error exit status (2)

Command executed:

  marker_alignments --min-read-query-length 60 --min-taxon-num-markers 2 --min-taxon-num-reads 2 --min-taxon-better-marker-cluster-averages-ratio 1.01 --threshold-avg-match-identity-to-call-known-taxon 0.97  --threshold-num-taxa-to-call-unknown-taxon 1 --threshold-num-markers-to-call-unknown-taxon 4     --threshold-num-reads-to-call-unknown-taxon 8     --input alignmentsPaired.sam     --refdb-marker-to-taxon-path      --refdb-format eukprot     --output-type taxon_all     --num-reads $(cat numReads.txt)     --output [sample].taxa.tsv

Command exit status:
  2

Command output:
  (empty)

Command error:
  usage: marker_alignments [-h] --input INPUT_ALIGNMENT_FILE [--sqlite-db-path SQLITE_DB_PATH] [--refdb-format REFDB_FORMAT] [--refdb-regex-taxon REFDB_REGEX_TAXON] [--refdb-regex-marker REFDB_REGEX_MARKER]
                           [--refdb-marker-to-taxon-path REFDB_MARKER_TO_TAXON_PATH] [--num-reads NUM_READS] [--output-type OUTPUT_TYPE] --output OUTPUT_PATH [--min-read-mapq MIN_READ_MAPQ] [--min-read-query-length MIN_READ_QUERY_LENGTH]
                           [--min-read-match-identity MIN_READ_MATCH_IDENTITY] [--min-taxon-num-markers MIN_TAXON_NUM_MARKERS] [--min-taxon-num-reads MIN_TAXON_NUM_READS] [--min-taxon-num-alignments MIN_TAXON_NUM_ALIGNMENTS]
                           [--min-taxon-fraction-primary-matches MIN_TAXON_FRACTION_PRIMARY_MATCHES] [--min-taxon-better-marker-cluster-averages-ratio MIN_TAXON_BETTER_CLUSTER_AVERAGES_RATIO]
                           [--threshold-avg-match-identity-to-call-known-taxon THRESHOLD_IDENTITY_TO_CALL_TAXON] [--threshold-num-reads-to-call-unknown-taxon THRESHOLD_NUM_READS_TO_CALL_UNKNOWN_TAXON]
                           [--threshold-num-markers-to-call-unknown-taxon THRESHOLD_NUM_MARKERS_TO_CALL_UNKNOWN_TAXON] [--threshold-num-taxa-to-call-unknown-taxon THRESHOLD_NUM_TAXA_TO_CALL_UNKNOWN_TAXON]
  marker_alignments: error: argument --refdb-marker-to-taxon-path: expected one argument

Work dir:
  /path/to/corral/work/ca/180880f058764ccab88e16db25f2ce

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

The error occurs because --refdb-marker-to-taxon-path was passed no value (see "Command executed" and "Command error" above).

I think it is related to the --marker_to_taxon_path parameter in the run.sh file. But in my run.sh, the parameter was set correctly (shown below), and I double-checked that the file exists and is intact:

  --marker_to_taxon_path path/to/eukdetect/db/busco_taxid_link.txt

Can you check this part? Thanks a lot.

wbazant commented 2 years ago

@ChanyeongKim This was my mistake, I regret causing the error and I'm glad I could learn about it from you. I originally called the parameter --marker_to_taxon_path, later changed it to camel case for consistency in the pipeline, and didn't update the README until now.

In your run.sh, could you refer to the file with --markerToTaxonPath instead?
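Since an unrecognised parameter name is silently left empty rather than rejected, a quick scan of the launch script can catch this kind of rename. The following is a rough Python sketch, not part of CORRAL: the helper names and the `main.nf` script name in the example are hypothetical, and the check is only a heuristic that assumes pipeline parameters are camel case, as described above. It flags any snake_case option and suggests the camel-case spelling.

```python
import re

# Matches options such as --marker_to_taxon_path (lowercase words joined by "_").
SNAKE_OPTION = re.compile(r"--[a-z0-9]+(?:_[a-z0-9]+)+")


def to_camel_case(option):
    """Convert a snake_case option like --a_b_c to camel case: --aBC."""
    head, *rest = option[2:].split("_")
    return "--" + head + "".join(part.capitalize() for part in rest)


def find_snake_case_options(script_text):
    """Return (found, suggested) pairs for each snake_case option in the text."""
    return [
        (match.group(0), to_camel_case(match.group(0)))
        for match in SNAKE_OPTION.finditer(script_text)
    ]


if __name__ == "__main__":
    # Hypothetical launch command; only the two option names come from this thread.
    script = "nextflow run main.nf --marker_to_taxon_path path/to/db.txt"
    for found, suggested in find_snake_case_options(script):
        print(f"{found} looks like snake_case; did you mean {suggested}?")
```

Underscores inside file paths are ignored because the pattern only matches tokens that start with `--`.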

ChanyeongKim commented 2 years ago

Thanks, I changed to --markerToTaxonPath, and it works now.

After that I got this error:

  Can't locate List/MoreUtils.pm in @INC (you may need to install the List::MoreUtils module)

So I installed List::MoreUtils, which solved it (just FYI).

Now I got the final result.

wbazant commented 2 years ago

Removed List::MoreUtils dependency in #ffe0ee02.