I hotfixed this in the new branch for the config improvements, just as a reference.
I cannot reproduce that error on my desktop. What is your script? How many parallel jobs did you want to use and how much RAM do you have?
@MCFlowMace The script for generation is pasted below. Can you pull the hotfix I posted and test whether it works on your desktop? The fix basically moves the locustcommands.sh
script into each subdirectory where the sim output goes. I think this is related to two Docker containers calling the same script loaded from the mounted volume. The parallel job count is 2 in this case.
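Roughly, the idea behind the hotfix is something like the following sketch (not the actual branch; the working_dir path and the Sim_theta_* directory names are assumptions based on the generation script below):

# Minimal sketch of the hotfix idea: give every simulation its own copy of
# locustcommands.sh so that two containers never execute the same file from
# the shared mounted volume.
import shutil
from pathlib import Path

working_dir = Path("/mnt/d/Data")                     # assumed layout, matching the script below
template_script = working_dir / "locustcommands.sh"   # the shared script on the mounted volume

for sim_dir in working_dir.glob("Sim_theta_*"):       # one subdirectory per SimConfig name
    shutil.copy(template_script, sim_dir / "locustcommands.sh")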
### This file generates the local template set.
import os
import hercules as he
import numpy as np
from pathlib import Path
config_file = Path(os.path.dirname(__file__)).joinpath("hercules_config.ini")
working_dir = Path("/").joinpath("mnt", "d", "Data")
# working_dir = Path("~/TempData").expanduser()
sim = he.KassLocustP3(working_dir, config_file)
# Does the number of channels automatically form a ring? Seems yes
n_channels = 60
r_range = np.linspace(0.002, 0.008, 8)
theta_range = np.linspace(89.7, 90.0, 30)
# r_range = np.linspace(0, 0.01, 1)
# theta_range = np.linspace(89.7, 90.0, 2)
r_phi_range = np.linspace(0, 2 * np.pi / 60, 1)
config_list = []
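# Build one SimConfig per (theta, r_phi, r) grid point; x and y give the corresponding Cartesian position.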
for theta in theta_range:
    for r_phi in r_phi_range:
        for r in r_range:
            x = r * np.cos(r_phi)
            y = r * np.sin(r_phi)
            r_phi_deg = np.rad2deg(r_phi)
            name = "Sim_theta_{:.4f}_R_{:.4f}_phi_{:.4f}".format(
                theta, r, r_phi_deg)
            config = he.SimConfig(name,
                                  n_channels=n_channels,
                                  seed_locust=42,
                                  seed_kass=43,
                                  egg_filename="simulation.egg",
                                  x_min=x,
                                  x_max=x,
                                  y_min=y,
                                  y_max=y,
                                  z_min=0,
                                  z_max=0,
                                  theta_min=theta,
                                  theta_max=theta,
                                  t_max=5e-6,
                                  v_range=3.0e-7,
                                  presample_spacing=150000,
                                  geometry='FreeSpaceGeometry_V00_00_10.xml')
            config_list.append(config)
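# Launch all simulations; in this setup they run in Docker containers, 2 jobs in parallel.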
sim(config_list)
For me the script above runs fine even without your fix. Again the question: how much RAM do you have? Other than that, it could indeed be due to the mounted drive, which I don't have. By the way, do you run it on Linux or with WSL?
Could you isolate your hotfix from your other work on a new branch that we can merge separately?
My RAM is 16 GB and it is not fully utilized when I apply the hotfix and run 2 jobs in parallel. I've also tested the mounted-drive idea: since I am using WSL, even with a path like ~/TempData
the script still did not work. I am going to create a hotfix branch for this issue; if the hotfix works on your end as well, we can merge it then. I think this might be a bug in Docker for Windows with the WSL environment, but I'm not sure, since you can't reproduce it.
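As a crude check on whether the failure really needs two simultaneous containers, one could serialize the run by submitting one config at a time (a sketch only; it assumes that calling sim() with a single-element list behaves like the full call in the script above):

# Hypothetical serialization check: submit the configs one by one so that only a
# single container runs at any time. If this completes while the 2-job run fails,
# the problem is specific to parallel containers sharing the mounted volume.
for config in config_list:
    sim([config])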
It seems that if the number of parallel jobs on the desktop is greater than 1, Locust in the container fails before completing. Here is the log output from Locust.