Closed ablot closed 1 year ago
Does the stack trace say anything about loop devices? I think (though I'm not sure) this looks similar to what I get with that error. In that case, existing image instances can be closed with
singularity instance list
(see list of names of existing instances)
singularity instance stop <instance name>
You can also try to clean the cache: https://github.com/SpikeInterface/spikeinterface/blob/main/src/spikeinterface/sorters/external/tests/test_singularity_containers.py#L22
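If there are many leftover instances, the two commands above could be scripted. This is only a rough sketch: `parse_instance_names` and `stop_all_instances` are hypothetical helpers, not part of spikeinterface, and the parser assumes that `singularity instance list` prints one header line followed by one row per instance with the name in the first column.

```python
from __future__ import annotations

import subprocess


def parse_instance_names(listing: str) -> list[str]:
    """Extract instance names from the text printed by `singularity instance list`.

    Assumption: the first non-empty line is a header, and the instance name is
    the first whitespace-separated column of every following line.
    """
    lines = [line for line in listing.splitlines() if line.strip()]
    return [line.split()[0] for line in lines[1:]]


def stop_all_instances() -> None:
    """Stop every instance reported by `singularity instance list`."""
    listing = subprocess.run(
        ["singularity", "instance", "list"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    for name in parse_instance_names(listing):
        # Same command as the manual step: singularity instance stop <instance name>
        subprocess.run(["singularity", "instance", "stop", name], check=True)
```

Calling `stop_all_instances()` needs Singularity on the PATH; the parsing function can be checked on its own with pasted output.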
So far no luck with either of those. I'll try to understand the issue more at the end of the week
Is there a minimum compatible version of Singularity? (Ours is a bit old: 3.6.4.)
We're using version 3.8.7 in our tests: https://github.com/SpikeInterface/spikeinterface/blob/main/.github/workflows/test_containers_singularity.yml
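For a quick comparison against the version used in the tests, something like the following might help. It is a sketch only: 3.8.7 is the version mentioned above, not a documented minimum, and `parse_version` and `is_older_than_ci` are hypothetical names. The parser assumes the output of `singularity --version` contains a plain `X.Y.Z` number.

```python
from __future__ import annotations

import re

# Version used in the container tests according to this thread (not an official minimum).
CI_VERSION = (3, 8, 7)


def parse_version(text: str) -> tuple[int, ...]:
    """Return the first X.Y.Z number found in `text` as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_older_than_ci(version_output: str) -> bool:
    """True if the reported version is lower than the one used in the tests."""
    return parse_version(version_output) < CI_VERSION
```

An older version is not proof of incompatibility; it only tells you that you are outside what the tests cover.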
I'm actually unsure whether this is a binding issue or not. The error happens only if I export SPIKEINTERFACE_DEV_PATH, and in that case the traceback contains the following (before the "no instance found"):
ERROR ['FATAL: while parsing bind path: while getting bind path: is not a valid bind option\n'] : return code 255
I've checked what the volumes dictionary was before starting the container, and it looks fine to me:
{
    "/nemo/lab/znamenskiyp/data/instruments/raw_data/projects/blota_onix_pilote/BRYA142.5d/S20231002/R121114_onix": {
        "bind": "/nemo/lab/znamenskiyp/data/instruments/raw_data/projects/blota_onix_pilote/BRYA142.5d/S20231002/R121114_onix",
        "mode": "ro"
    },
    "/nemo/lab/znamenskiyp/home/shared/projects/blota_onix_pilote/BRYA142.5d/S20231002/R121114_onix/verboseTrue_devTrue": {
        "bind": "/nemo/lab/znamenskiyp/home/shared/projects/blota_onix_pilote/BRYA142.5d/S20231002/R121114_onix/verboseTrue_devTrue",
        "mode": "rw"
    },
    "/nemo/lab/znamenskiyp/home/users/blota/code/spikeinterface": {
        "bind": "/nemo/lab/znamenskiyp/home/users/blota/code/spikeinterface",
        "mode": "ro"
    }
}
I don't have that issue if I don't export the environment variable. I'll switch to the non-dev for now
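One way to narrow this down might be to check what bind strings a dictionary like the one above would turn into. Singularity's `--bind` flag accepts specs of the form `src[:dest[:opts]]`, where `opts` is `ro` or `rw`. The helper below is a hypothetical sketch and not how spikeinterface builds its bind arguments; it only validates a Docker-style volumes dictionary and prints the specs one would expect. As a guess only: the error text shows nothing before "is not a valid bind option", which might mean an empty option field reached Singularity.

```python
from __future__ import annotations

# Bind options checked by this sketch.
VALID_MODES = {"ro", "rw"}


def volumes_to_bind_specs(volumes: dict[str, dict[str, str]]) -> list[str]:
    """Turn a {source: {"bind": dest, "mode": mode}} dict into src:dest:mode strings.

    Hypothetical helper for debugging; raises ValueError on an empty path
    or on a mode other than "ro"/"rw".
    """
    specs = []
    for source, options in volumes.items():
        destination = options.get("bind")
        mode = options.get("mode")
        if not source or not destination:
            raise ValueError(f"empty path in volume entry: {source!r} -> {destination!r}")
        if mode not in VALID_MODES:
            raise ValueError(f"invalid bind mode {mode!r} for {source!r}")
        specs.append(f"{source}:{destination}:{mode}")
    return specs


if __name__ == "__main__":
    # Shortened example paths, for illustration only.
    example = {
        "/data/raw": {"bind": "/data/raw", "mode": "ro"},
        "/data/output": {"bind": "/data/output", "mode": "rw"},
    }
    for spec in volumes_to_bind_specs(example):
        print("--bind", spec)
```

If all three entries pass this check, the malformed bind string is probably assembled somewhere else, for example where the dev path is added.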
I'm not making much progress. I now have
Traceback (most recent call last):
  File "/nemo/lab/znamenskiyp/home/users/blota/code/spikeinterface/test_container_mini.py", line 7, in <module>
    sorting = ss.run_sorter(
  File "/nemo/lab/znamenskiyp/home/users/blota/code/spikeinterface/src/spikeinterface/sorters/runsorter.py", line 142, in run_sorter
    return run_sorter_container(
  File "/nemo/lab/znamenskiyp/home/users/blota/code/spikeinterface/src/spikeinterface/sorters/runsorter.py", line 530, in run_sorter_container
    container_client = ContainerClient(
  File "/nemo/lab/znamenskiyp/home/users/blota/code/spikeinterface/src/spikeinterface/sorters/runsorter.py", line 300, in __init__
    raise FileNotFoundError(
FileNotFoundError: Unable to locate container image spikeinterface/kilosort3-compiled-base
The output starts with:
Singularity: pulling image spikeinterface/kilosort3-compiled-base
singularity pull --name kilosort3-compiled-base.sif docker://spikeinterface/kilosort3-compiled-base
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
Getting image source signatures
Copying blob sha256:47c7644723910b6dfc6ec8b3bd9fed3ac32778cf485ce3a6535ff6b6da06f743
Copying blob sha256:85aaf046f0365a57a54dc3f66ba5dfa79e928e885b0705214fe1b5b3ce148438
Copying blob sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1
panic: page 3 already freed
Do you know what the panic is about?
Not sure honestly! :(
This was a cache issue. My system was using .local/share/containers/cache/, which apparently is not emptied by singularity cache clean because it could be used by other platforms.
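For anyone hitting the same panic, a small script can show whether either cache location still holds data after cleaning. This is a sketch under assumptions: the Singularity cache is taken from the `SINGULARITY_CACHEDIR` environment variable or, failing that, `~/.singularity/cache`, and the second path is the one mentioned above. `candidate_cache_dirs` and `dir_size_bytes` are hypothetical helper names.

```python
from __future__ import annotations

import os
from pathlib import Path
from typing import Mapping


def candidate_cache_dirs(home: Path, env: Mapping[str, str] | None = None) -> list[Path]:
    """Return the cache directories worth checking, relative to `home`."""
    env = os.environ if env is None else env
    override = env.get("SINGULARITY_CACHEDIR")
    singularity_cache = Path(override) if override else home / ".singularity" / "cache"
    # Second entry: the shared containers cache reported in this thread.
    return [singularity_cache, home / ".local" / "share" / "containers" / "cache"]


def dir_size_bytes(path: Path) -> int:
    """Total size of regular files under `path`; 0 if the directory is missing."""
    if not path.is_dir():
        return 0
    return sum(item.stat().st_size for item in path.rglob("*") if item.is_file())


if __name__ == "__main__":
    for directory in candidate_cache_dirs(Path.home()):
        print(f"{directory}: {dir_size_bytes(directory)} bytes")
```

The script only reports sizes and deletes nothing, since the second directory may be shared with other container tools.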
Since I upgraded to spikeinterface >0.9, I have been struggling with Singularity. After fixing the binding issues (#2059), I get another crash that I don't really understand:
I assume that it is an issue with the ContainerClient class, but I'm not sure where to look. Do you have any idea?