dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

ipyrad assembly step (1) stalls indefinitely #506

Closed billiemaguire closed 1 year ago

billiemaguire commented 1 year ago

I am having an issue where certain ipyrad steps, particularly the assembly step, will never stop running. I can create the params file just fine, but the following steps will stall indefinitely. Uninstalling ipyrad or creating a new conda environment will sometimes solve the problem temporarily.

I've installed miniconda and ipyrad on my university's HPC like this:

```bash
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
bash
conda info

conda create --name ipyrad28
conda activate ipyrad28

conda config --set solver classic
conda install -n base conda-libmamba-solver
conda config --set solver libmamba
conda install -v ipyrad -c conda-forge -c bioconda
```
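As a quick sanity check after installing (not part of the original report; assumes the ipyrad28 environment is active), you can confirm the executable resolves inside the new environment:

```bash
# Hypothetical post-install check: confirm ipyrad is on PATH in the env.
conda activate ipyrad28
which ipyrad        # should point into the ipyrad28 environment
ipyrad --version    # prints the installed ipyrad version
```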

I've written this at the top of my sbatch script:

```bash
#!/bin/bash
#SBATCH -p cpu-long    # partition
#SBATCH -c 12          # cpu
#SBATCH --mem=100G     # Requested Memory
#SBATCH -o ipy5-123b   # %j = job ID

module load miniconda
eval "$(/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/bin/conda shell.bash hook)"
source /work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/
conda activate ipyrad28
```
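The script then runs the assembly step that stalls; a representative invocation would look something like the line below (the params-file name is a placeholder, not taken from the original script):

```bash
# Placeholder params-file name; -s 1 runs assembly step 1, -c matches #SBATCH -c 12.
ipyrad -p params-epi.txt -s 1 -c 12
```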

Any insight into what might be causing this issue would be greatly appreciated.

Thank you so much, Billie

isaacovercast commented 1 year ago

Hello Billie,

Thanks for providing all the great detail in your issue, it helps a lot.

This sounds like an ipcluster startup timing issue, which is somewhat common on HPC systems. You can work around it by launching the ipyparallel cluster by hand before running ipyrad. Docs for this are here: https://ipyrad.readthedocs.io/en/latest/HPC_script.html#optional-controlling-ipcluster-by-hand
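In outline, the pattern from the linked docs looks like this (a minimal sketch; the params-file name, core count, and sleep duration below are placeholders you'd adapt to your job):

```bash
# Start the ipyparallel cluster yourself, give the engines time to connect,
# then tell ipyrad to attach to the running cluster with --ipcluster
# instead of launching its own.
ipcluster start -n 12 --daemonize
sleep 60
ipyrad -p params-epi.txt -s 1234567 -c 12 --ipcluster
ipcluster stop
```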

billiemaguire commented 1 year ago

Hi Isaac,

Thanks so much for your reply. I tried what you suggested, and I get `OSError: [Errno 122] Disk quota exceeded` when I run `ipcluster start --n 20 --daemonize`.

Here's what I did:

```
william_maguire002_umb_edu@login5:/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts$ srun -c 9 -p cpu --pty bash
srun: job 7408078 queued and waiting for resources
srun: job 7408078 has been allocated resources

william_maguire002_umb_edu@cpu022:/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts$ eval "$(/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/bin/conda shell.bash hook)"

william_maguire002_umb_edu@cpu022:/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts$ source /work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/
bash: source: /work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/: is a directory

william_maguire002_umb_edu@cpu022:/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts$ conda activate ipyrad5-4

(ipyrad5-4) william_maguire002_umb_edu@cpu022:/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts$ ipcluster start --n 20 --daemonize
2023-05-08 16:02:44.669 [IPClusterStart] Starting ipcluster with [daemonize=True]
Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOMainLoop object at 0x7f2a2cdae590>>, <Task finished name='Task-1' coro=<IPClusterStart.start_cluster() done, defined at /work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/envs/ipyrad5-4/lib/python3.10/site-packages/ipyparallel/cluster/app.py:566> exception=OSError(122, 'Disk quota exceeded')>)
Traceback (most recent call last):
  File "/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/envs/ipyrad5-4/lib/python3.10/site-packages/tornado/ioloop.py", line 738, in _run_callback
    ret = callback()
  File "/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/envs/ipyrad5-4/lib/python3.10/site-packages/tornado/ioloop.py", line 762, in _discard_future_result
    future.result()
  File "/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/envs/ipyrad5-4/lib/python3.10/site-packages/ipyparallel/cluster/app.py", line 567, in start_cluster
    await self.cluster.start_cluster()
  File "/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/envs/ipyrad5-4/lib/python3.10/site-packages/ipyparallel/cluster/cluster.py", line 776, in start_cluster
    await self.start_controller()
  File "/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/envs/ipyrad5-4/lib/python3.10/site-packages/ipyparallel/cluster/cluster.py", line 654, in start_controller
    r = self.controller.start()
  File "/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/envs/ipyrad5-4/lib/python3.10/site-packages/ipyparallel/cluster/launcher.py", line 669, in start
    return super().start()
  File "/work/pi_mayra_cadorinvidal_umb_edu/epi/scripts/bash/envs/ipyrad5-4/lib/python3.10/site-packages/ipyparallel/cluster/launcher.py", line 523, in start
    with open(self.output_file, "ab") as f, open(os.devnull, "rb") as stdin:
OSError: [Errno 122] Disk quota exceeded: '/home/william_maguire002_umb_edu/.ipython/profile_default/log/ipcontroller-516692.log'
```

isaacovercast commented 1 year ago

Well, that all looks fine, except the error message indicates you've exceeded the disk quota for your home directory. This could be what the real problem has been all along. Check your disk usage against your quota. You'll probably have to remove some things from your home directory to make room for the ipyparallel log files (they aren't big, but moving them to a different location would be annoying). Good luck and let me know how it goes.
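For example (generic commands; the exact quota tool is site-specific, so check your cluster's docs):

```bash
# Illustrative ways to inspect home-directory usage; the precise quota
# command varies by cluster, so treat these as a starting point.
quota -s                                     # per-user quota summary, where enabled
du -sh ~/.[!.]* ~/* 2>/dev/null | sort -h    # largest items in $HOME, sorted
df -h ~                                      # free space on the filesystem backing $HOME
```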

billiemaguire commented 1 year ago

That works! Thank you so much for your help. I really appreciate it.