bergmanlab / mcclintock

Meta-pipeline to identify transposable element insertions using next generation sequencing data

How to run mcclintock with a bash script? #100

Closed dgodin19 closed 1 year ago

dgodin19 commented 1 year ago

Hi,

I am trying to run mcclintock on a server, and will need a bash script to do so.

So far, I have:

#! /path/to/interpreter
conda activate mcclintock
python3 /path/to/mcclintock.py -r test/sacCer2.fasta -c test/sac_cer_TE_seqs.fasta -g test/reference_TE_locations.gff -t test/sac_cer_te_families.tsv -1 test/SRR800842_1.fastq.gz -2 test/SRR800842_2.fastq.gz -p 4 -o /path/output/directory

But I don't think this is right.

cbergman commented 1 year ago

Hi @dgodin19

If you are running the job interactively on a server, the following pseudocode should work:

#!/bin/bash
conda activate mcclintock
python3 /path/to/mcclintock_repo/mcclintock.py 
   -r /path/to/mcclintock_repo/test/sacCer2.fasta 
   -c /path/to/mcclintock_repo/test/sac_cer_TE_seqs.fasta 
   -g /path/to/mcclintock_repo/test/reference_TE_locations.gff 
   -t /path/to/mcclintock_repo/test/sac_cer_te_families.tsv 
   -1 /path/to/mcclintock_repo/test/SRR800842_1.fastq.gz 
   -2 /path/to/mcclintock_repo/test/SRR800842_2.fastq.gz 
   -p 4 # you can change this to however many cores are available
   -o /path/output/directory

If you are running the job by submitting it to a cluster, you may need to change conda activate to source activate.
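If neither `conda activate` nor `source activate` works inside a submitted job, a common cause is that the job's shell has not loaded conda's shell functions. Here is a minimal sketch of one workaround, assuming conda is already on the job's PATH and the environment is named mcclintock as above. The `activated` variable is hypothetical and only records which branch ran:

```shell
#!/bin/bash
# Sketch only: assumes conda is installed and reachable from the job's PATH.
if command -v conda >/dev/null 2>&1; then
    # conda.sh defines the shell function that "conda activate" relies on.
    . "$(conda info --base)/etc/profile.d/conda.sh"
    conda activate mcclintock
    activated=yes
else
    # Without conda on PATH, activation cannot work; report it and move on.
    echo "conda not found on PATH; make it available before activating" >&2
    activated=no
fi
```

Whether this is needed depends on how conda was set up on your cluster, so treat it as something to try rather than a required step.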

Hope this helps, Casey

dgodin19 commented 1 year ago

Hi @cbergman

Thanks for helping me out. I get the following error with the pseudocode:

usage: McClintock [-h] -r REFERENCE -c CONSENSUS -1 FIRST [-2 SECOND]
                  [-p PROC] [-o OUT] [-m METHODS] [-g LOCATIONS] [-t TAXONOMY]
                  [-s COVERAGE_FASTA] [-T] [-a AUGMENT]
                  [--sample_name SAMPLE_NAME] [--resume] [--install] [--debug]
                  [--slow] [--make_annotations] [-k KEEP_INTERMEDIATE]
                  [--config CONFIG]
McClintock: error: the following arguments are required: -r/--reference, -c/--consensus, -1/--first
/var/spool/uge/sage003/job_scripts/290912197: line 4: -r: command not found
/var/spool/uge/sage003/job_scripts/290912197: line 5: -c: command not found
/var/spool/uge/sage003/job_scripts/290912197: line 6: -g: command not found
/var/spool/uge/sage003/job_scripts/290912197: line 7: -t: command not found
/var/spool/uge/sage003/job_scripts/290912197: line 8: -1: command not found
/var/spool/uge/sage003/job_scripts/290912197: line 9: -2: command not found
/var/spool/uge/sage003/job_scripts/290912197: line 10: -p: command not found
/var/spool/uge/sage003/job_scripts/290912197: line 11: -o: command not found
dgodin19 commented 1 year ago

I figured it out. The mcclintock command and all of its options have to be on one line. For anybody else who stumbles upon this, it is:

#!/bin/bash

source activate mcclintock
python3 /path/to/repository/mcclintock.py -r /path/to/test/data/mcclintock/test/sacCer2.fasta -c /path/to/test/data/mcclintock/test/sac_cer_TE_seqs.fasta -g /path/to/test/data/mcclintock/test/reference_TE_locations.gff -t /path/to/test/data/mcclintock/test/sac_cer_te_families.tsv -1 /path/to/test/data/mcclintock/test/SRR800842_1.fastq.gz -2 /path/to/test/data/mcclintock/test/SRR800842_2.fastq.gz -p 1 -o /path/to/output/directory
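
For anyone who prefers one option per line: the `-r: command not found` errors earlier in this thread happen because bash ends a command at each newline, so every option line was run as its own command. A trailing backslash continues the command onto the next line. Below is a minimal sketch of that, reusing the placeholder paths from this thread. The `repo` variable is a hypothetical name introduced for illustration, and the script only prints the command instead of running it:

```shell
#!/bin/bash
# Sketch only: "repo" is a hypothetical variable; the paths are the
# placeholders used earlier in this thread, not real locations.
repo=/path/to/mcclintock_repo

# Activate the environment first, on its own line, e.g.:
#   source activate mcclintock

# Each trailing backslash joins the next line to this command.
# Nothing may follow the backslash, so comments go on separate lines.
# -p sets the number of cores; change 4 to however many are available.
set -- \
    -r "$repo/test/sacCer2.fasta" \
    -c "$repo/test/sac_cer_TE_seqs.fasta" \
    -g "$repo/test/reference_TE_locations.gff" \
    -t "$repo/test/sac_cer_te_families.tsv" \
    -1 "$repo/test/SRR800842_1.fastq.gz" \
    -2 "$repo/test/SRR800842_2.fastq.gz" \
    -p 4 \
    -o /path/output/directory

# Dry run: print the full command. Remove "echo" to run mcclintock.
echo python3 "$repo/mcclintock.py" "$@"
```

Printing the command first makes it easy to confirm that all the options ended up on one logical line before submitting the job.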