sgkit-dev / bio2zarr

Convert bioinformatics file formats to Zarr
Apache License 2.0

vcf2zarr leaks semaphore objects #209

Open tomwhite opened 5 months ago

tomwhite commented 5 months ago

From https://github.com/sgkit-dev/bio2zarr/issues/201#issuecomment-2112045205

On a Mac:

vcf2zarr convert sample.vcf.gz sample.zarr -p 1
    Scan: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.00/1.00 [00:00<00:00, 2.57files/s]
 Explode: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.00/9.00 [00:00<00:00, 22.8vars/s]
  Encode: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 927/927 [00:00<00:00, 1.62kB/s]
Finalise: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21.0/21.0 [00:00<00:00, 1.64karray/s]
/Users/tom/miniconda3/envs/bio2zarr/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 3 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

jeromekelleher commented 5 months ago

Can you try that with just explode, and with different numbers of workers please?

vcf2zarr explode sample.vcf.gz sample.icf

tomwhite commented 5 months ago

vcf2zarr explode sample.vcf.gz sample.icf
Do you want to overwrite sample.icf? (use --force to skip this check) [y/N]: y
    Scan: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.00/1.00 [00:00<00:00, 2.48files/s]
 Explode: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.00/9.00 [00:00<00:00, 23.1vars/s]
/Users/tom/miniconda3/envs/bio2zarr/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 2 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

vcf2zarr explode sample.vcf.gz sample.icf -p 3
Do you want to overwrite sample.icf? (use --force to skip this check) [y/N]: y
    Scan: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.00/1.00 [00:00<00:00, 2.51files/s]
 Explode: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.00/9.00 [00:00<00:00, 20.8vars/s]
/Users/tom/miniconda3/envs/bio2zarr/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 4 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

jeromekelleher commented 5 months ago

Hmm, seems to be something to do with tqdm:

jeromekelleher commented 5 months ago

I had a go at this in #214 - would you mind trying it out and seeing if it resolves this problem please @tomwhite?

tomwhite commented 5 months ago

Still seeing the warning with the new code...

jeromekelleher commented 5 months ago

Hmm, ok thanks. I might need a bit of help with this one then, it's too obscure to track down without being able to reproduce. Any chance you could take a look?🙏

tomwhite commented 5 months ago

I tried removing all the tqdm code (including imports) and I still get the warning. I'll keep looking, but I'm not really sure what is going on here!

jeromekelleher commented 4 months ago

Great to know that tqdm isn't causing this! I guess it must be something to do with the locks associated with the multiprocessing.Value. Is there any way we can get more detailed feedback on where these semaphores are being leaked?
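
For context, here is a minimal sketch of the lock being referred to. By default `multiprocessing.Value` is created with `lock=True`, which wraps the shared value together with a lock that `get_lock()` returns; with `lock=False` it returns the raw shared ctypes object and no lock is created. The helper name `make_counters` is purely illustrative, and whether these locks are the source of the leak reported here is not established in this thread.

```python
import multiprocessing


def make_counters():
    """Return one locked and one unlocked shared integer (illustrative helper)."""
    # Default lock=True: the value is wrapped together with a lock.
    locked = multiprocessing.Value("i", 0)
    # lock=False: a raw shared ctypes object, with no lock created for it.
    unlocked = multiprocessing.Value("i", 0, lock=False)
    return locked, unlocked


locked, unlocked = make_counters()

# Updates to the locked value are guarded by its associated lock.
with locked.get_lock():
    locked.value += 1
```

Comparing runs with and without the lock could be one way to narrow down whether it is involved, at the cost of losing synchronisation on the counter.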

jeromekelleher commented 4 months ago

Did you have any luck tracking this down @tomwhite? Some things that would be useful to try:

I'd really like to get rid of this...

jeromekelleher commented 4 months ago

Can we just capture the warning in main, as a workaround also?

jeromekelleher commented 4 months ago

Looks like this is some Python 3.9 on Mac quirk - I've reproduced on CI on both ARM and Intel. [Screenshot from 2024-05-25 23-09-09]

jeromekelleher commented 4 months ago

I made an attempt to catch the warnings in #226, but it's tricky to do this via CI. I'm sure the warnings are harmless, and as it's only on Python 3.9 I don't think we need to make it a release blocker. Would be nice to just suppress the warning, at the same time, though.
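
One possible way to detect the warning in an automated check is to run the command in a subprocess and search its captured stderr for the message text shown earlier in this thread. The sketch below makes no assumptions about bio2zarr's internals. The names `has_leak_warning` and `run_and_check` are hypothetical, and this is not necessarily what #226 does.

```python
import subprocess

# Fragment of the message quoted in this issue; the leading count varies.
LEAK_MARKER = "leaked semaphore objects"


def has_leak_warning(stderr_text):
    """Return True if the text contains the resource_tracker leak warning."""
    return "resource_tracker" in stderr_text and LEAK_MARKER in stderr_text


def run_and_check(cmd):
    """Run cmd (a list of arguments) and report (returncode, warning_seen)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, has_leak_warning(proc.stderr)
```

For example, `run_and_check(["vcf2zarr", "explode", "sample.vcf.gz", "sample.icf", "--force"])` would report whether the warning appeared for the explode step, assuming `vcf2zarr` is on the path.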

jeromekelleher commented 4 months ago

https://discuss.pytorch.org/t/issue-with-multiprocessing-semaphore-tracking/22943

Some comments here on how to suppress.
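
A sketch of one suppression approach, under stated assumptions: the warning is emitted by the resource tracker, which runs as a separate helper process, so a `warnings.filterwarnings` call in the main process would not reach it. Setting the `PYTHONWARNINGS` environment variable before any multiprocessing objects are created lets the helper process inherit the filter. The helper name `suppress_resource_tracker_warning` is hypothetical, and whether this silences the warning for vcf2zarr on Python 3.9 on macOS has not been verified here.

```python
import os

# PYTHONWARNINGS entries use the "action:message:category" format; the
# message part matches the start of the warning text.
RESOURCE_TRACKER_FILTER = "ignore:resource_tracker:UserWarning"


def suppress_resource_tracker_warning(environ=None):
    """Append the filter to PYTHONWARNINGS, keeping any existing filters."""
    if environ is None:
        environ = os.environ
    existing = environ.get("PYTHONWARNINGS", "")
    filters = [entry for entry in existing.split(",") if entry]
    if RESOURCE_TRACKER_FILTER not in filters:
        filters.append(RESOURCE_TRACKER_FILTER)
    environ["PYTHONWARNINGS"] = ",".join(filters)
    return environ["PYTHONWARNINGS"]
```

This would need to be called early in main, before the first lock, queue or pool is created, since the tracker process picks up its environment when it starts.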

jeromekelleher commented 4 months ago

Just to update here that I'm working on tracking this down. It's a doozy...

jeromekelleher commented 4 months ago

I've tried lots of different ways to resolve this, and have come to the conclusion that it's something quite specific to Python 3.9 on Macs. Given that we don't leak semaphores on later Python versions it seems likely to me that this is an underlying bug in Python, and (given the substantial effort in trying to find workarounds) there's likely not much we can do about it.

So, the conclusion is to mark this as a known issue, and to document the problem, suggesting that users move to a later Python version if they are going to be doing serious work on their Macs.
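
As an illustration of how the known issue could be detected at runtime, the sketch below checks for the affected combination described in this thread (Python 3.9 on macOS). The function name `is_affected_platform` is hypothetical and is not part of bio2zarr.

```python
import sys


def is_affected_platform(version_info=None, platform=None):
    """Return True for Python 3.9 on macOS, the combination reported here."""
    if version_info is None:
        version_info = sys.version_info
    if platform is None:
        platform = sys.platform
    # sys.platform is "darwin" on macOS.
    return platform == "darwin" and tuple(version_info[:2]) == (3, 9)
```

A tool could use such a check to point users at the documented known issue, rather than leaving them to interpret the resource_tracker warning on their own.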

tomwhite commented 4 months ago

I was away last week, so thanks for tracking this down and documenting everything!

I agree that it's fine to document the Python 3.9 limitation. Python 3.9 won't be around much longer anyway (https://scientific-python.org/specs/spec-0000/).