ucscGenomeBrowser / kent

UCSC Genome Browser source tree. Stable branch: "beta".
http://genome.ucsc.edu/

path issue for doBlastzChainNet.pl #63

Closed wthomas14 closed 2 years ago

wthomas14 commented 2 years ago

Sorry for posting about what might be a relatively easy path issue that I have not yet been able to fix. I have been running through the example at http://genomewiki.ucsc.edu/index.php/DoBlastzChainNet.pl#PATH_setup to make sure everything is working before using my genomes of interest.

All the steps preceding running the actual script work, except that I am running this on my university HPCC, so all of my scripts, bins, and genomes are in a local directory, not the root directory.

When I run

    doBlastzChainNet.pl DEF -verbose=10 -noDbNameCheck -workhorse=localhost -bigClusterHub=localhost -skipDownload -dbHost=localhost -smallClusterHub=localhost -trackHub -fileServer=localhost -syntenicNet

I get:

DEF looks OK!
    tDb=dm6
    qDb=GCF_000005575.2_AgamP3
    s1d=/gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/dm6.2bit
    isSelf=
bash: hgsql: command not found
bash: hgsql: command not found
HgStepManager: executing from step 'partition' through step 'syntenicNet'.
HgStepManager: executing step 'partition' Tue Nov 23 17:24:20 2021.
# chmod a+x /gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/trackData/GCF_000005575.2_AgamP3/run.blastz/doPartition.bash
# ssh -x -o 'StrictHostKeyChecking = no' -o 'BatchMode = yes' localhost nice /gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/trackData/GCF_000005575.2_AgamP3/run.blastz/doPartition.bash
+ cd /gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/trackData/GCF_000005575.2_AgamP3/run.blastz
+ /gpfs/scratch/withomas/project_noRoot_MGA/data/scripts/partitionSequence.pl 32100000 10000 /gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/dm6.2bit /gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/dm6.chrom.sizes -xdir xdir.sh -rawDir ../psl 18 -lstDir tParts
lstDir tParts must be empty, but seems to have files  (part062.lst ...)
Command failed:
ssh -x -o 'StrictHostKeyChecking = no' -o 'BatchMode = yes' localhost nice /gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/trackData/GCF_000005575.2_AgamP3/run.blastz/doPartition.bash

So my first question would be, why is my hgsql command not being found? I have the bin and scripts directories exported on my PATH in my bashrc:

    export PATH=/usr/bin:/usr/sbin:/gpfs/scratch/withomas/project_noRoot_MGA/data/bin:/gpfs/scratch/withomas/project_noRoot_MGA/data/scripts:$PATH

and I am able to use it outside of the script:

    which hgsql
    /gpfs/scratch/withomas/project_noRoot_MGA/data/bin/hgsql

and the path is set in my DEF file:

# dm6 vs GCF_000005575.2_AgamP3
PATH=/gpfs/scratch/withomas/project_noRoot_MGA/data/scripts:/gpfs/scratch/withomas/project_noRoot_MGA/data/bin
BLASTZ=/gpfs/scratch/withomas/project_noRoot_MGA/data/bin/lastz-1.04.00
BLASTZ_H=2000
BLASTZ_Y=3400
BLASTZ_L=4000
BLASTZ_K=2200
BLASTZ_Q=/gpfs/scratch/withomas/project_noRoot_MGA/data/lastz/HoxD55.q

# TARGET: D. melanogaster dm6
SEQ1_DIR=/gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/dm6.2bit
SEQ1_LEN=/gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/dm6.chrom.sizes
SEQ1_CHUNK=32100000
SEQ1_LAP=10000
SEQ1_LIMIT=18

# QUERY: GCF_000005575.2_AgamP3
SEQ2_DIR=/gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/trackData/GCF_000005575.2_AgamP3/GCF_000005575.2_AgamP3.2bit
SEQ2_LEN=/gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/trackData/GCF_000005575.2_AgamP3/GCF_000005575.2_AgamP3.chrom.sizes
SEQ2_CHUNK=1000000
SEQ2_LIMIT=2000
SEQ2_LAP=0

BASE=/gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/trackData/GCF_000005575.2_AgamP3
TMPDIR=/gpfs/scratch/withomas/project_noRoot_MGA/dev/shm

but still no luck.
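
One way to narrow this down, offered as a sketch rather than a diagnosis: the log shows doBlastzChainNet.pl launching its step script through ssh to localhost, and a non-interactive ssh session does not necessarily see the same PATH as an interactive login shell. Whether that is where the hgsql message comes from is an assumption. The helper below, find_on_path, is hypothetical (it is not part of the kent tree); it checks whether a command would resolve on a given PATH string, so the PATH reported by the ssh session can be tested directly.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the kent tree): print the first match for
# a command on a given colon-separated PATH string, without touching $PATH.
find_on_path() {
    local cmd="$1" path_string="$2" dir
    local IFS=':'
    # Unquoted on purpose: IFS=':' splits the string into directories.
    for dir in $path_string; do
        if [ -n "$dir" ] && [ -x "$dir/$cmd" ] && [ ! -d "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done
    return 1
}

# Usage sketch (assumes passwordless ssh to localhost, as in the log above):
#   remote_path=$(ssh -x -o 'BatchMode = yes' localhost 'echo "$PATH"')
#   find_on_path hgsql "$remote_path" || echo "hgsql is not on the ssh PATH"
```

If the ssh session reports a PATH without the data/bin directory, that would be consistent with hgsql working interactively but not from inside the script.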

I've attempted to play around with some of the functions in doBlastzChainNet.pl, such as loadDef and requirePath, but still haven't been able to figure it out. Any help would be greatly appreciated!

Noting: perhaps it isn't even a path issue, if HgStepManager seems to be working fine?
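
On that note, the message that actually stops the run in the log above comes from partitionSequence.pl: lstDir tParts must be empty, but seems to have files (part062.lst ...). One possible reading, and it is only an assumption, is that tParts still holds output from an earlier attempt in the same run.blastz directory. A minimal check along those lines, using a hypothetical helper named dir_has_entries:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the directory exists and contains
# at least one entry (hidden files included).
dir_has_entries() {
    local dir="$1"
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Usage sketch, with the run directory taken from the log above:
#   run_dir=/gpfs/scratch/withomas/project_noRoot_MGA/data/genomes/dm6/trackData/GCF_000005575.2_AgamP3/run.blastz
#   if dir_has_entries "$run_dir/tParts"; then
#       echo "tParts is not empty; partitionSequence.pl will refuse to run"
#   fi
```

This only reports the state of the directory. Whether the leftover files are safe to remove before rerunning the partition step depends on the earlier run and is not something the log settles.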

genome-www commented 2 years ago

Dear wthomas14,

This question is likely best sent to our mailing list at @., which is a public archived mailing list. We also have a private internal list at @. for questions that have sensitive data.

You can search our public list for similar archived historical questions: http://genome.ucsc.edu/contacts.html

For instance, here is a search for all conversations that mention the doBlastzChainNet.pl script: https://groups.google.com/a/soe.ucsc.edu/g/genome/search?q=doBlastzChainNet.pl

All the best,

AngieHinrichs commented 2 years ago

@wthomas14 it looks like GitHub helpfully sanitized the email addresses out of my colleague's reply there. That should read:

This question is likely best sent to our mailing list at genome@soe.ucsc.edu, which is a public archived mailing list. We also have a private internal list at genome-www@soe.ucsc.edu for questions that have sensitive data.