sr320 / course-fish546-2018


Completing job on MOX #68

Closed · calderatta closed 4 years ago

calderatta commented 5 years ago

I've been running a published custom Perl script to parse reads into targeted genes. It uses blastn and is taking a while (almost a week so far), so I'm trying to get access to Hyak. Meanwhile, it would be great if someone with access could help me run the remainder of this job on Mox. There's a directory in my repo with the script to run. All commands are local.

sr320 commented 5 years ago

@kubu4 do you think you could get this going on mox?

sr320 commented 5 years ago

Admittedly, I'm not readily finding the actual Perl script, though it's stated to be in the zip file.

kubu4 commented 5 years ago

Yeah, we'll need more info on this...

kubu4 commented 5 years ago

Well, actually, I think I see what's happening. Give me a sec...

kubu4 commented 5 years ago

Looking at this, it will likely go faster if he adds the -num_threads n option to the blastn command on line 64 of the band.pl file. Currently, it's not set, so I think blastn defaults to a single thread.

Regardless, I'll get this running on Mox shortly.
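
For context, this is the general shape of the change being suggested. The actual blastn call inside band.pl isn't shown in this thread, so the query, database, and output names below are illustrative only:

# Hypothetical blastn call as written (blastn defaults to a single thread):
blastn -query reads.fasta -db targets_db -outfmt 6 -out hits.tab

# The same call with multithreading enabled:
blastn -query reads.fasta -db targets_db -outfmt 6 -num_threads 28 -out hits.tab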

kubu4 commented 5 years ago

Up and running here:

/gscratch/srlab/calder_exon_capture

sr320 commented 5 years ago

Thanks!

calderatta commented 5 years ago

Thank you to both of you. Let me know if you need more info.

kubu4 commented 5 years ago

That BLAST job finished (received the notification email; no errors reported). I haven't looked at the SLURM output file for any errors yet, though.
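
A quick way to scan a SLURM output file for problems, assuming SLURM's default slurm-<jobID>.out naming; this is just an illustrative one-liner, not a command from the thread:

grep -iE "error|fail|warn" slurm-*.out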

sr320 commented 5 years ago

Here are the resulting files: http://gannet.fish.washington.edu/seashell/bu-mox/calder_exon_capture/

sr320 commented 5 years ago

And for reference, this is how @kubu4 ran it on Mox:

#!/bin/bash
## Job Name
#SBATCH --job-name=calder_blast
## Allocation Definition 
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=30-0:00:10
## Memory per node
#SBATCH --mem=120
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/calder_exon_capture

# Add BLAST to system PATH
export PATH=$PATH:/gscratch/srlab/programs/ncbi-blast-2.6.0+/bin

# Run Calder's script (modified - removed curl and unzip commands from original version)
bash main.sh

calderatta commented 5 years ago

While looking into the Trinity issue, I found that the problem might be coming from this step: the band.pl file is out of date. Can I ask for help running this again with the updated version?

kubu4 commented 5 years ago

Submitted this to the CoEnv nodes, but the job is queued, as someone is currently using both CoEnv nodes (their jobs have been running for 3.5 days already, so hopefully they finish soon).

Set the max runtime to two days to try to avoid a conflict with the scheduled downtime on 12/11/2018. However, that might be moot if the other job takes too long. Regardless, I'll post here when the job starts and ends.
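
For anyone following along, a sketch of how a queued job like this is typically checked on Mox; the job ID below is hypothetical:

squeue -p coenv            # list jobs queued/running on the coenv partition
squeue -u $USER            # list just your own jobs
scontrol show job 123456   # detailed settings for a specific (hypothetical) job ID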

Working directory is set as:

/gscratch/srlab/calder_exon_capture/20181205

Here's the SBATCH script used to run the job:

#!/bin/bash
## Job Name
#SBATCH --job-name=calder_blast
## Allocation Definition 
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=02-0:00:10
## Memory per node
#SBATCH --mem=120
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/calder_exon_capture/20181205

# Add BLAST to system PATH
export PATH=$PATH:/gscratch/srlab/programs/ncbi-blast-2.6.0+/bin

## All input files and scripts were downloaded and unzipped using following commands:
## curl http://eagle.fish.washington.edu/fish546/calder/preads/preads-data.zip > preads-data.zip
## unzip preads-data.zip

# Edit the band.pl file to run blastn with 28 threads.
# Uses sed pattern matching to find the line containing the blastn command,
# then uses the sed substitute command to append -num_threads 28 after -outfmt 6.
sed -i '/blastn/ s/-outfmt 6/-outfmt 6 -num_threads 28/' band.pl

# Run Calder's script (modified - removed curl and unzip commands from original version)
bash main.sh

kubu4 commented 5 years ago

Also, @calderatta, you should update your project zip file to include the updated Perl script. I didn't realize it hadn't been updated until just now - not sure what made me double check that...

Anyway, I replaced it with the version from commit 6f881a0, and the job will use that version when it actually runs.

kubu4 commented 5 years ago

@calderatta Job just failed because there wasn't a file called main.sh to run. I didn't realize it was missing, as I had just downloaded and unzipped that previous zip file. Where does main.sh live so I can run it?

kubu4 commented 5 years ago

Never mind, found it. However, the job is failing to execute because there's a Perl module the script requires that we don't have installed (line 28 of the script: Parallel::ForkManager). I'll see if I can get it installed.
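
The thread doesn't say how the module was installed; below is a sketch of one common way to install a Perl module into a user-writable location, assuming cpanm is available on the system:

# Install Parallel::ForkManager into ~/perl5 without root access:
cpanm --local-lib=~/perl5 Parallel::ForkManager

# Point Perl at that location:
export PERL5LIB=~/perl5/lib/perl5:$PERL5LIB

# Quick check that the module is now visible to the interpreter:
perl -MParallel::ForkManager -e 'print "$Parallel::ForkManager::VERSION\n"'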

kubu4 commented 5 years ago

Ok, got that module installed and the job is running!

calderatta commented 5 years ago

Thanks, Sam. Looks like that module requirement was added in the updated version, but glad it's running.

kubu4 commented 5 years ago

Job finished with the proper exit code, but a glance at the SLURM output file makes me wonder if the job ran properly.

[screenshot of the SLURM output file]
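
A sketch of how one might double-check a finished SLURM job beyond its exit code; the job ID is hypothetical and the strings worth grepping for depend on the script's own logging:

sacct -j 123456 --format=JobID,JobName,State,ExitCode,Elapsed   # SLURM accounting view of the job
grep -ic "error" slurm-123456.out                               # count lines mentioning errors in the output file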