Open amitpendharkar opened 3 weeks ago
Hi Amit, thanks for sharing this. I'll try to find a cleaner way that satisfies both scenarios.
In the meantime, if you set it with os.environ before you import the mdio functions, it should still work for scenario 2. Can you please try it and let me know?
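To make the ordering concrete, here is a minimal sketch of why setting the variable before the import matters. It does not import mdio; `STAND_IN_SOURCE`, `load_stand_in`, `worker_count`, and the default of 8 are hypothetical stand-ins for a module that reads MDIO__IMPORT__CPU_COUNT once at import time, as described in this issue.

```python
import os
import types

# Hypothetical stand-in for a module that reads MDIO__IMPORT__CPU_COUNT once,
# at import time. The default of 8 is an assumption for illustration only.
STAND_IN_SOURCE = '''
import os
NUM_CPUS = int(os.getenv("MDIO__IMPORT__CPU_COUNT", "8"))

def worker_count():
    return NUM_CPUS
'''


def load_stand_in():
    """Run the stand-in source the way an import runs module-level code."""
    module = types.ModuleType("stand_in_blocked_io")
    exec(STAND_IN_SOURCE, module.__dict__)
    return module


# Scenario 2 as reported: the variable is set after the import, so it is ignored.
os.environ.pop("MDIO__IMPORT__CPU_COUNT", None)
late = load_stand_in()
os.environ["MDIO__IMPORT__CPU_COUNT"] = "2"
late_count = late.worker_count()

# Workaround: set the variable first, then import.
os.environ["MDIO__IMPORT__CPU_COUNT"] = "2"
early = load_stand_in()
early_count = early.worker_count()

print(late_count, early_count)
```

In a real script, this would mean placing the os.environ assignment above the first mdio import statement.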
Issue
The value of the environment variable MDIO__IMPORT__CPU_COUNT should be used to limit the number of processes spawned by ProcessPoolExecutor. However, this value is not used in one particular situation. Of the following two scenarios, it is used in scenario 1 but not in scenario 2.

Scenario 1: Set the environment variable MDIO__IMPORT__CPU_COUNT and then run the script that invokes the segy_to_mdio function: Works. E.g. launch a pod with the environment variable MDIO__IMPORT__CPU_COUNT already set and then run the script.

Scenario 2: Run the script, set the environment variable MDIO__IMPORT__CPU_COUNT in that script, and then invoke the segy_to_mdio function: Does not work. E.g. launch a pod, use an argument sent to the script to set the environment variable MDIO__IMPORT__CPU_COUNT in the script, and then invoke the segy_to_mdio function.

This happens because NUM_CPUS is assigned once, when the code is loaded into memory before execution starts, so the environment variable has to be set before the script runs.

Suggested solution
Re-read the value of MDIO__IMPORT__CPU_COUNT just before the following line and save it in NUM_CPUS. This would ensure that scenario 2 also works.
https://github.com/TGSAI/mdio-python/blob/14757936e914589283233dedd9f55f87b1a95a6f/src/mdio/segy/blocked_io.py#L122
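A rough sketch of what re-reading at call time could look like is below. This is not the actual mdio code: `resolve_num_cpus`, `make_executor`, and the fallback to `os.cpu_count()` are assumptions made for illustration, and the real default in mdio may differ.

```python
import os
from concurrent.futures import ProcessPoolExecutor

# Assumed fallback when the variable is unset; not taken from mdio.
DEFAULT_CPU_COUNT = os.cpu_count() or 1


def resolve_num_cpus() -> int:
    """Read MDIO__IMPORT__CPU_COUNT at call time instead of import time."""
    raw = os.getenv("MDIO__IMPORT__CPU_COUNT")
    if raw is None:
        return DEFAULT_CPU_COUNT
    # ProcessPoolExecutor rejects max_workers <= 0, so clamp to at least 1.
    return max(1, int(raw))


def make_executor() -> ProcessPoolExecutor:
    """Hypothetical helper: build the pool with the freshly read worker count."""
    return ProcessPoolExecutor(max_workers=resolve_num_cpus())


# The value now follows changes made inside the running script (scenario 2).
os.environ["MDIO__IMPORT__CPU_COUNT"] = "2"
first = resolve_num_cpus()
os.environ["MDIO__IMPORT__CPU_COUNT"] = "3"
second = resolve_num_cpus()
print(first, second)
```

With this shape, scenario 1 keeps working because the variable is already present when the function is called, and scenario 2 works because the lookup happens after the script has set it. A non-numeric value would raise ValueError in `int(raw)`; how that should be handled is left open here.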