lnyawen closed this issue 1 year ago
Hello, Yes, all of our development was done on SLURM, but we are ready to add options for other clusters based on user demand. We will work on PBS support based on your request, probably based on this profile. We'll try to get it implemented as quickly as possible, but I'll need to figure out a way to test the PBS profile since our cluster is SLURM-based.
Hi
I'd be glad to help test the PBS profile if you'd like. How should I go about it?
Yawen
Thanks, that's a nice offer! I'll let you know here how we proceed. I'm actually going on vacation for the next 2 weeks though, so it is unlikely I'll be able to do anything until I'm back, unfortunately.
OK, tell me what I should do when you get back. Have a good vacation!
Yawen
Hello authors,
I deployed a PBS profile following this, and I successfully tested Snakemake with the PBS profile using the command
snakemake -p -s ~/PhyloAcc-test-data/phyloacc-test/phyloacc-job-files/snakemake/run_phyloacc.smk --configfile ~/PhyloAcc-test-data/phyloacc-test/phyloacc-job-files/snakemake/phyloacc-config.yaml --profile pbs-torque --dryrun
PhyloAcc-test-data comes from here.
However, after the batches completed, I ran phyloacc_post.py -i phyloacc-test to gather the outputs, but I got an error:
--------------------------------------------------------
**Error OP5: Error reading tree from interface log file!
--------------------------------------------------------
How can I resolve this error?
Thanks! Yawen
Hi Yawen, That's great that you got a PBS profile working for PhyloAcc! Would you be ok sharing it so we can try and work it in as an input option?
As for the error, I would need to see your interface log file to start to get an idea for what's happening. Can you copy it here if it isn't too large? Thanks!
Hello
Absolutely, I'm glad to share what I did. First, I deployed the PBS profile with these commands:
mkdir -p ~/.config/snakemake
cd ~/.config/snakemake
cookiecutter https://github.com/Snakemake-Profiles/pbs-torque.git
cd pbs-torque && chmod 755 pbs*
Then I ran the following command to create the Snakemake file:
phyloacc.py -a simu_500_200_diffr_2-1.fa -b simu_500_200_diffr_2-1.bed -i id-subset.txt -m ratite.mod -o phyloacc-test -t "strCam;rhePen;rheAme;casCas;droNov;aptRow;aptHaa;aptOwe;anoDid" -g "allMis;allSin;croPor;gavGan;chrPic;cheMyd;anoCar" -n 4 -batch 5 -j 2 -part "core28"
and got the resulting snakemake command, which is printed to the screen:
snakemake -p -s /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/phyloacc-test/phyloacc-job-files/snakemake/run_phyloacc.smk --configfile /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/phyloacc-test/phyloacc-job-files/snakemake/phyloacc-config.yaml --profile /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/phyloacc-test/phyloacc-job-files/snakemake/profiles/slurm_profile --dryrun
Then I replaced --profile /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/phyloacc-test/phyloacc-job-files/snakemake/profiles/slurm_profile with --profile pbs-torque.
Finally, I got the message that it ran successfully.
[Fri Oct 14 09:45:53 2022]
Finished job 0.
3 of 3 steps (100%) done
Complete log: .snakemake/log/2022-10-14T093959.452945.snakemake.log
unlocking
removing lock
removing lock
removed all locks
I also checked that the result directory phyloacc-test/phyloacc-job-files/phyloacc-output contains the result files.
These are all the commands I used to test PhyloAcc-test-data with the PBS profile. Is there anything else I need to do?
As for the error from phyloacc_post.py -i phyloacc-test, I found only two log files in the phyloacc-test directory: phyloacc-test.log and final-results.log. Is the interface log file you mentioned one of them?
Thanks for your help!
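The profile swap described above can also be done without hand-editing the long command. This is a minimal sketch, not part of PhyloAcc: swap_profile is a hypothetical helper name, and the shortened command string is only for illustration.

```shell
# Hypothetical helper: replace the value given to --profile in a snakemake command line.
# $1: the command printed by phyloacc.py
# $2: the new profile name or path (must not contain '#' or '&', which sed would interpret)
swap_profile() {
    printf '%s\n' "$1" | sed -E "s#--profile[ =][^ ]+#--profile $2#"
}

# Illustrative command with shortened paths.
cmd='snakemake -p -s run_phyloacc.smk --configfile phyloacc-config.yaml --profile profiles/slurm_profile --dryrun'
swap_profile "$cmd" pbs-torque
```

The function only prints the rewritten command, so it can be inspected before it is run.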
So it just ran with the cookiecutter profile, that's great! That will be easy to incorporate.
For the logfile, I would need to see the phyloacc-test.log.
Thanks!
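A bare profile name such as pbs-torque works because Snakemake looks profiles up in per-user and system configuration directories, which on Linux include ~/.config/snakemake, where the cookiecutter step placed it. The sketch below checks for an installed profile. It assumes the profile contains a config.yaml, and profile_installed is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a Snakemake profile with the given name
# has a config.yaml in the per-user profile directory used in this thread.
profile_installed() {
    [ -f "$HOME/.config/snakemake/$1/config.yaml" ]
}

if profile_installed pbs-torque; then
    echo "pbs-torque profile found"
else
    echo "pbs-torque profile not found in ~/.config/snakemake"
fi
```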
Hello,
This is my phyloacc-test.log file, which is a little large.
[liunyw@mu01 phyloacc-test]$ cat phyloacc-test.log
# Welcome to PhyloAcc -- Bayesian rate analysis of conserved non-coding genomic elements.
# Version 2.0.0 released on April 1, 2022
# PhyloAcc was developed by Zhirui Hu, Han Yan, Gregg Thomas, Tim Sackton, Scott Edwards, and Jun Liu
# Citation: https://doi.org/10.1093/molbev/msz049
# Website: https://phyloacc.github.io
# Report issues: https://github.com/phyloacc/PhyloAcc
#
# The date and time at the start is: 10.14.2022 | 09:32:52
# Using Python version: 3.10.6
#
# The program was called as: /gpfs/home/liunyw/mambaforge-pypy3/envs/PhyloAcc/bin/phyloacc.py -a simu_500_200_diffr_2-1.fa -b simu_500_200_diffr_2-1.bed -i id-subset.txt -m ratite.mod -o phyloacc-test -t strCam;rhePen;rheAme;casCas;droNov;aptRow;aptHaa;aptOwe;anoDid -g allMis;allSin;croPor;gavGan;chrPic;cheMyd;anoCar -n 4 -batch 5 -j 2 -part core28
#
# -----------------------------------------------------------------------------------------------------------------------------
# INPUT/OUTPUT INFO:
# Alignment file: /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/simu_500_200_diffr_2-1.fa
# Bed file: /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/simu_500_200_diffr_2-1.bed
# Tree/rate file (mod file from PHAST): /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/ratite.mod
# Tree read from mod file: (((((((((((((((taeGut:0.0465637,ficAlb:0.0538332)taeGut-ficAlb:0.00653656,pseHum:0.0414039)taeGut-pseHum:0.0162337,corBra:0.0350559)taeGut-corBra:0.104721,melUnd:0.0935108)taeGut-melUnd:0.0152322,falPer:0.0676997)taeGut-falPer:0.00595262,((picPub:0.154108,lepDis:0.0567586)picPub-lepDis:0.00987136,halLeu:0.046153)picPub-halLeu:0.00237951)taeGut-picPub:0.00502294,(((aptFor:0.0110665,pygAde:0.0132217)aptFor-pygAde:0.0216787,fulGla:0.0326388)aptFor-fulGla:0.0034278,nipNip:0.0427518)aptFor-nipNip:0.00725913)taeGut-aptFor:0.00238832,(balReg:0.0519596,chaVoc:0.0560994)balReg-chaVoc:0.0048854)taeGut-balReg:0.00599725,((calAnn:0.0977611,chaPel:0.081066)calAnn-chaPel:0.0304959,cucCan:0.101256)calAnn-cucCan:0.00794796)taeGut-calAnn:0.00244451,(colLiv:0.0945655,mesUni:0.0851707)colLiv-mesUni:0.0127853)taeGut-colLiv:0.0304131,((galGal:0.0376982,melGal:0.0420019)galGal-melGal:0.0915582,anaPla:0.0856191)galGal-anaPla:0.0361731)taeGut-galGal:0.0405465,((((((aptHaa:0.00138798,aptOwe:0.00163359)aptHaa-aptOwe:0.00305011,aptRow:0.00410502)aptHaa-aptRow:0.0277314,(casCas:0.0115431,droNov:0.0137378)casCas-droNov:0.0273843)aptHaa-casCas:0.0028791,(rheAme:0.00469461,rhePen:0.00533595)rheAme-rhePen:0.0566016)aptHaa-rheAme:0.00185129,(((cryCin:0.0470774,tinGut:0.038861)cryCin-tinGut:0.0172047,(eudEle:0.0654903,notPer:0.0730502)eudEle-notPer:0.00799637)cryCin-eudEle:0.0671317,anoDid:0.0560433)cryCin-anoDid:0.0251786)aptHaa-cryCin:0.0118409,strCam:0.0513888)aptHaa-strCam:0.0406895)taeGut-aptHaa:0.169725,((allMis:0.00896896,allSin:0.00775865)allMis-allSin:0.0142506,(croPor:0.0178745,gavGan:0.0144863)croPor-gavGan:0.0116871)allMis-croPor:0.147354)taeGut-allMis:0.0317238,(chrPic:0.0287726,cheMyd:0.0316043)chrPic-cheMyd:0.0842993)taeGut-chrPic:0.248317,anoCar:0.248317)taeGut-anoCar;
# Output directory: phyloacc-test
# PhyloAcc run directory: phyloacc-test/phyloacc-job-files
# Log file: phyloacc-test.log
# -----------------------------------------------------------------------------------------------------------------------------
# DEPENDENCY PATHS:
# Program Specified Path
# PhyloAcc PhyloAcc-ST
# -----------------------------------------------------------------------------------------------------------------------------
# SPECIES GROUPS:
# Group Species
# Targets (-t) strCam;rhePen;rheAme;casCas;droNov;aptRow;aptHaa;aptOwe;anoDid
# Conserved (-c) taeGut;ficAlb;pseHum;corBra;melUnd;falPer;picPub;lepDis;halLeu;aptFor;pygAde;fulGla;nipNip;balReg;chaVoc;calAnn;chaPel;cucCan;colLiv;mesUni;galGal;melGal;anaPla;cryCin;tinGut;eudEle;notPer
# Outgroups (-g) allMis;allSin;croPor;gavGan;chrPic;cheMyd;anoCar
# -----------------------------------------------------------------------------------------------------------------------------
# CLUSTER OPTIONS:
# Option Setting
# Partition(s) core28
# Number of nodes 1
# Max mem per job (gb) 4
# Time per job 1:00:00
# -----------------------------------------------------------------------------------------------------------------------------
# OPTIONS INFO:
# Option Current setting Current action
# -i: /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/id-subset.txtOnly loci names specified in this file will be tested.
# -r: st All loci will be run with the species tree model of PhyloAcc
# -burnin: 500 This number of steps in the chain will discarded as burnin
# -mcmc: 1000 The number of steps in each chain
# -chain: 1 The number of chains to run
# Loci per batch (-batch) 5 PhyloAcc will run this many loci in a single command.
# Current processes (-n) 4 This interface will use this many processes.
# Jobs (-j) 2 PhyloAcc will submit this many jobs concurrently.
# Processes per job (-p) 1 Each job will use this many processes.
# --summarize False PhyloAcc batch files will be generated and written to the job directory specified above.
# --theta False A species tree with branch lengths in coalescent units will NOT be estimated.
# --quiet False Time, memory, and status info will be printed to the screen while PhyloAcc is running.
# -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
# Date Time Current step Status Elapsed time (s) Step time (s) Current mem usage (MB) Virtual mem usage (MB)
# -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
# 10.14.2022 09:32:52 Detecting compression of seq file Success: No compression detected 0.42472 0.03124 70.62109 4117.99219
# 10.14.2022 09:32:52 Reading input FASTA Success: 43 seqs read 0.43554 0.01041 73.71875 4118.1875
# 10.14.2022 09:32:52 Reading locus IDs Success: 10 IDs read 0.43591 0.00021 73.71875 4118.1875
# 10.14.2022 09:32:53 Detecting compression of bed file Success: No compression detected 0.44941 0.01338 73.71875 4118.1875
# 10.14.2022 09:32:53 Reading input bed file Success: 9 loci read 0.44991 0.00039 73.71875 4118.1875
# 10.14.2022 09:32:53 Partitioning alignments by locus Success: 9 alignments partitioned 0.45018 0.00016 73.71875 4118.1875
# 10.14.2022 09:32:53 Calculating alignment stats Success: 9 alignments processed 0.46598 0.01568 73.98828 4118.1875
# 10.14.2022 09:32:53 Writing: phyloacc-aln-stats.csv Success: align stats written 0.46671 0.00046 73.98828 4118.1875
# 10.14.2022 09:32:53 Writing PhyloAcc job files Success: 2 jobs written 0.47017 0.0033 73.98828 4118.1875
# 10.14.2022 09:32:53 Writing Snakemake file Success: Snakemake file written 0.47062 0.00028 74.03906 4118.1875
# 10.14.2022 09:32:53 Writing Snakemake config file Success: Snakemake config written 0.47097 0.00023 74.03906 4118.1875
# 10.14.2022 09:32:53 Writing Snakemake cluster profile Success: Snakemake profile written 0.47146 0.00038 74.03906 4118.1875
# 10.14.2022 09:32:53 Generating summary plots Success 1.43354 0.96191 97.5625 4307.90625
# 10.14.2022 09:32:53 Writing HTML summary file Success 1.43424 0.00045 97.5625 4307.90625
# ===============================================================================================================================================================================
#
# Done!
# The date and time at the end is: 10.14.2022 | 09:32:53
# Total execution time: 1.434 seconds.
# Output directory for this run: phyloacc-test
# Log file for this run: phyloacc-test/phyloacc-test.log
# Alignment stats file: phyloacc-test/phyloacc-aln-stats.csv
# HTML summary file: phyloacc-test/phyloacc-pre-run-summary.html
#
# PhyloAcc job files successfully generated
# Run the following command to run the PhyloAcc batches:
snakemake -p -s /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/phyloacc-test/phyloacc-job-files/snakemake/run_phyloacc.smk --configfile /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/phyloacc-test/phyloacc-job-files/snakemake/phyloacc-config.yaml --profile /gpfs/home/liunyw/biosoft/PhyloAcc-test-data/phyloacc-test/phyloacc-job-files/snakemake/profiles/slurm_profile --dryrun
# Then, if everything looks right, remove --dryrun to execute
# You may also want to start your favorite terminal multiplexer (e.g. screen, tmux)
# ===============================================================================================================================================================================
#
Hi Yawen,
I think I found the problem: the phyloacc_post.py script is still trying to use an old method for reading the trees. In fact I didn't even add the parameter to read it with the new method, so that's the actual error that is occurring. I will try to post an update sometime today or tomorrow and I'll let you know here when that goes through.
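To confirm that a log actually contains a tree, the "Tree read from mod file" line can be pulled out directly. This is only a sketch based on the log format shown above. It is not how phyloacc_post.py reads the tree, and extract_tree is a hypothetical name.

```shell
# Hypothetical helper: print the Newick tree recorded in a PhyloAcc interface log.
# $1: path to the log file. Prints nothing if no tree line is present.
extract_tree() {
    sed -n 's/^# Tree read from mod file:[[:space:]]*//p' "$1"
}

# Example (path taken from the thread):
# extract_tree phyloacc-test/phyloacc-test.log
```

Empty output would mean the log has no tree line in this format.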
That's great! Thanks for your help!
Hey, sorry for the slow response regarding the post-processing script. The PR with the updated version was stuck in the bioconda queue for a few days. Version 2.1.0 is up now and includes phyloacc_post.py, so you can try conda update phyloacc or just reinstall it in a fresh environment, and the script should now be callable.
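After updating, it can help to check that the installed version is at least 2.1.0 (conda list phyloacc shows it). The comparison below is a small sketch. version_ge is a hypothetical helper, and it assumes a sort that supports -V, such as GNU coreutils.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) being available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Illustrative check with the versions mentioned in this thread.
if version_ge "2.1.0" "2.1.0"; then
    echo "version is new enough for phyloacc_post.py"
fi
```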
Hello,
I have updated phyloacc to version 2.1.0, and phyloacc_post.py is working successfully!
Thanks for your kind help!
Yawen
Dear @xyz111131,
Thank you for developing such great software! I found that the -part "[STRING]" cluster option is specific to Slurm. However, the cluster I am using is PBS. How should I set it?
Yawen