Good day. You already have the correct way to do this:
import multiprocessing as mp

def heif_init():
    # runs once in each spawned worker process
    from pillow_heif import register_heif_opener
    register_heif_opener()

with mp.Pool(1, initializer=heif_init) as pool:
    pool.map(sample_image_transform, image_bytes)
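The initializer runs once in every worker process right after it starts, so the opener is registered before the first task executes.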
The same thing is done in FastAPI applications:
from contextlib import asynccontextmanager
from fastapi import FastAPI
from pillow_heif import register_heif_opener

@asynccontextmanager
async def lifespan(app: FastAPI):  # executes in each worker process of the webserver
    register_heif_opener()
    yield
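With a multi-worker server (for example several uvicorn or gunicorn workers), each worker process executes this lifespan on startup, so every subprocess registers the plugin for itself.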
Pillow itself requires each subprocess to register its plugins; it does not do automatic plugin registration, for security reasons.
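For illustration, registration only mutates process-local registries inside PIL.Image, so it can never carry over into a freshly spawned interpreter (a rough sketch; "HEIF" is the format key pillow-heif registers its opener under):

from PIL import Image
from pillow_heif import register_heif_opener

print("HEIF" in Image.OPEN)  # False: PIL's opener registry starts empty
register_heif_opener()
print("HEIF" in Image.OPEN)  # True, but only in this process;
                             # a spawned child starts from scratch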
Thanks for the quick response and the clarification of the behavior. I will proceed with the per-subprocess initialization technique.
You're welcome, always happy to help
Describe why it is important and where it will be useful
When decoding images with a torch DataLoader (or, more generally, a multiprocessing pool) and mp/torch start_method = "spawn", needing to register the HEIF opener per process (e.g. in an initializer or worker_init_fn) is a bit of a gotcha. Calling register_heif_opener() at the global scope of your program is not enough, because the plugin ends up unregistered in the workers after the spawn.
Repro:
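A minimal sketch of the failure (sample.heic, sample_image_transform, and image_bytes are illustrative stand-ins, not the exact code I ran):

import io
import multiprocessing as mp

from PIL import Image
from pillow_heif import register_heif_opener

def sample_image_transform(data: bytes):
    # In a spawned worker the HEIF opener is not registered,
    # so this raises PIL.UnidentifiedImageError.
    return Image.open(io.BytesIO(data)).size

if __name__ == "__main__":
    register_heif_opener()  # registered in the parent process only
    mp.set_start_method("spawn")
    image_bytes = [open("sample.heic", "rb").read()]  # any HEIF file
    with mp.Pool(1) as pool:
        print(pool.map(sample_image_transform, image_bytes))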
Output:
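With a script like the sketch above, the worker fails with PIL.UnidentifiedImageError, since no registered plugin recognizes the HEIF data.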
Describe your proposed solution
I'm not sure how you'd do this - open to discussion.
Describe alternatives you've considered, if relevant
If I explicitly call register_heif_opener() in the initializer of my mp pools or torch dataloaders (as sketched below), there's no issue. But that's easy to forget and causes difficult-to-debug errors.
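For reference, the same workaround for a torch DataLoader, with my_dataset as a hypothetical Dataset that decodes HEIF bytes:

from torch.utils.data import DataLoader

def heif_worker_init(worker_id: int) -> None:
    # runs once in every freshly spawned DataLoader worker
    from pillow_heif import register_heif_opener
    register_heif_opener()

loader = DataLoader(
    my_dataset,  # hypothetical Dataset yielding decoded HEIF images
    num_workers=4,
    worker_init_fn=heif_worker_init,
    multiprocessing_context="spawn",
)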
I'm not sure how to persist imports across a spawn, but I believe some libraries, e.g. torch, manage it somehow.
Additional context
No response