mhelabd closed this issue 2 years ago
I think your running environment looks good. The error is in the two lines below: WarpDrive cannot find the source code for your environment's step() function in a .cu file, so it cannot initialize your step function.
self._cuda_functions[fname] = self._CUDA_module.get_function(fname)
pycuda._driver.LogicError: cuModuleGetFunction failed: named symbol not found
Inside the code, it happens in the Foundation wrapper, in the snippet below. You need a step kernel function named f"Cuda{self.name}Step" in a .cu source file, and that file must be registered with the env registrar (customized_env_registrar below) under its absolute path.
self.cuda_function_manager.compile_and_load_cuda(
    env_name=self.name,
    template_header_file="template_env_config.h",
    template_runner_file="template_env_runner.cu",
    customized_env_registrar=customized_env_registrar,
)
print("initialize_functions...")
step_function = f"Cuda{self.name}Step"
self.cuda_function_manager.initialize_functions([step_function])
self.env.cuda_step = self.cuda_function_manager.get_function(step_function)
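The naming rule above can be checked against a .cu source before compiling. This is only a rough sketch and not part of WarpDrive: the helper names expected_step_kernel and defines_kernel are hypothetical, the sample kernel signature is made up, and it assumes the kernel is declared in the source text as `__global__ void <name>(`.

```python
import re


def expected_step_kernel(env_name: str) -> str:
    """Build the kernel name the wrapper looks up, following f"Cuda{self.name}Step"."""
    return f"Cuda{env_name}Step"


def defines_kernel(cuda_source: str, kernel_name: str) -> bool:
    """Return True if the source text declares a __global__ void function with this name."""
    pattern = r"__global__\s+void\s+" + re.escape(kernel_name) + r"\s*\("
    return re.search(pattern, cuda_source) is not None


# Hypothetical CUDA source used only for illustration.
sample_source = """
extern "C" {
  __global__ void CudaMyEnvStep(float* obs, int* done) {
    // step logic would go here
  }
}
"""

if __name__ == "__main__":
    for name in ("MyEnv", "OtherEnv"):
        kernel = expected_step_kernel(name)
        print(kernel, "found" if defines_kernel(sample_source, kernel) else "missing")
```

If the expected name is reported as missing, looking it up in the compiled module would be expected to fail in the same way as the "named symbol not found" error above.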
Please let me know if you run into any problems. I am more than happy to help.
Thanks for your question, @mhelabd.
Adding on to @Emerald01's response:
To run the simple-wood-and-stone environment with WarpDrive, you would first need to create a CUDA version of the environment. To get started, please see our tutorial: https://github.com/salesforce/ai-economist/blob/master/tutorials/multi_agent_gpu_training_with_warp_drive.ipynb. It shows how to build and train your environment end-to-end with WarpDrive, and also points out nuances such as how to name your GPU kernels.
Also, in your current training script, you are pointing to "../foundation/scenarios/covid19/covid19_build.cu", which only contains the paths to the source files for the covid and economy environment, not for simple_wood_and_stone.
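One way to see which sources a build file pulls in is to list its include paths. The sketch below is illustrative only: it assumes the build file names its sources through `#include "..."` lines, the helper names are hypothetical, and the sample text is made up rather than copied from the actual covid19_build.cu.

```python
import re
from typing import List


def included_sources(build_file_text: str) -> List[str]:
    """Collect the paths named in #include "..." directives, in order of appearance."""
    return re.findall(r'^\s*#include\s+"([^"]+)"', build_file_text, flags=re.MULTILINE)


def mentions_env(build_file_text: str, env_keyword: str) -> bool:
    """Return True if any included path contains the given keyword."""
    return any(env_keyword in path for path in included_sources(build_file_text))


# Made-up build file contents, for illustration only.
sample_build = """
#include "components_step.cu"
#include "covid19_env_step.cu"
"""

if __name__ == "__main__":
    print(included_sources(sample_build))
    print(mentions_env(sample_build, "covid19"))
    print(mentions_env(sample_build, "wood_and_stone"))
```

A build file that never names a source for the environment you want to run cannot provide that environment's step kernel.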
In fact, we do not yet have a CUDA C version of the wood-and-stone environment that can run on a GPU with WarpDrive. If you would like to contribute to that environment, we would love to add it to the repository. Happy to answer any other questions. Thanks.
I am currently running a training script using warp-drive.
I have my environment initialized in this dockerfile.
When running my training_script, I get the following error:
python training_script.py --env simple_wood_and_stone
I was wondering whether anyone has run into this before or has any idea how to fix it?