Open water-e opened 6 months ago
I like the idea of logger per agency and also logging different loggers to different files to avoid file contention. I don't want to go down the route of a logging service (which can handle multiple processes) but requires a running service (Azure and other cloud providers do this).
I can create context based logs (populate_cdec.log, reformat_usbr.log). That will take care of the across-agency issue which is the one that was already there.
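A minimal sketch of the per-context logger idea, assuming the log-file names above; the `get_context_logger` helper and the `dms_datastore.*` logger names are hypothetical, not the library's actual API:

```python
import logging

def get_context_logger(context: str, log_dir: str = ".") -> logging.Logger:
    """Return a logger that writes to its own file, e.g. populate_cdec.log."""
    logger = logging.getLogger(f"dms_datastore.{context}")
    if not logger.handlers:  # avoid stacking duplicate handlers on repeat calls
        handler = logging.FileHandler(f"{log_dir}/{context}.log")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False  # keep messages out of any shared root log

    return logger

cdec_log = get_context_logger("populate_cdec")
usbr_log = get_context_logger("reformat_usbr")
cdec_log.info("CDEC download started")
usbr_log.info("USBR reformat started")
```

Since each context gets its own file handler, two contexts launched at the same time never contend for the same file.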
On the per-row multithreading, there are two ways to go:
I agree that solving the logging issue in a fine-grained multiprocessing context sounds awfully complicated.
The other thing is that these downloaders and reformatters have some utility outside of the populate_repo setting. The lighter-weight they are, the better.
From: Nicky Sandhu — Sent: Friday, January 5, 2024 — Subject: Re: [CADWRDeltaModeling/dms_datastore] logging is not multi-processor safe (Issue #35)
python logging is thread safe but not multiprocessor safe. There is also a file contention issue ... I can't remember what two things I tried to run at the same time, but the second launch didn't work because the logging file was busy.
This is not a new issue, just a growing one. Introducing logging into the older "by agency" multiprocessing creates the possibility of contention, and I'm sure printing messages was even worse. But these things are changing.
As an aside, the use of a single logger possibly locks up how much of the library can be used at once. I know I use download_xxx for other things (e.g. download of mission-critical items for SCHISM). This is an incentive to allow more loggers, not a module-level logger.
The topic of multi-processor safe writing is covered in a couple places, though I can't vouch for the info: https://docs.python.org/dev/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes
If we split up the logging and don't do logging in multiprocessed functions, we could also create multiple loggers or a pool of them. But the logging plan should probably precede any more work with multiprocessing. This probably affects reformat as well.