jonppe opened 10 months ago
@Rubiel1 do you have Docker or Podman installed? Which version? Are you running Windows or Linux?
You could try setting the environment variable `LOGLEVEL=debug` and then re-running the command.
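A minimal sketch of that step, assuming a POSIX shell (the commented-out `funsearch run` invocation is a placeholder; substitute your actual command):

```shell
# Enable debug logging for subsequent commands in this shell.
export LOGLEVEL=debug
echo "LOGLEVEL is now: $LOGLEVEL"
# Then re-run the command, e.g. (placeholder, adapt to your setup):
# funsearch run examples/your_spec.py 11
```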
In particular, I'd check the command that looks something like this:

```
DEBUG:root:Executing: podman run --stop-timeout=30 -v /home/johannes/..../funsearch/funsearch/container/container_main.py:/main.py:ro -v /home/johannes/..../funsearch/data/1705907648/sandbox0/call1:/workspace -v /home/johannes/..../funsearch/data/1705907648/sandbox0/inputs/11.pickle:/input.pickle:ro funsearch_sandbox:latest /usr/local/bin/python3 /main.py /workspace/prog.pickle /input.pickle /workspace/output.pickle 2> data/1705907648/sandbox0/stderr_1.log
```
And then also check the error file, `stderr_1.log` in this case. But perhaps the log has other interesting parts too.
I haven't paid quite as much attention to Windows and Docker support, so maybe there's something I missed. The command tries to mount container_main.py as /main.py inside the container, but perhaps this doesn't work with some kinds of folder structures or folder names on Windows...
Hello, I use Fedora 38 with 62.6 GiB of RAM.
I erased everything and installed funsearch again, and after `podman run...` (in the workspace) I run

```
llm install llm-gpt4all
```

which allowed me to specify the model

```
--model_name orca-mini-3b-gguf2-q4_0
```
After that the program runs; however, it eventually crashes after writing the 61st prompt message.
The testing parameters of the last experiment were

```python
functions_per_prompt: int = 2
num_islands: int = 4
reset_period: int = 4 * 60 * 60
cluster_sampling_temperature_init: float = 0.1
cluster_sampling_temperature_period: int = 30_000
backup_period: int = 10
backup_folder: str = './data/backups'
```

and

```python
programs_database: ProgramsDatabaseConfig = dataclasses.field(
    default_factory=ProgramsDatabaseConfig)
num_samplers: int = 1
num_evaluators: int = 1
samples_per_prompt: int = 4
```
with messages

```
DEBUG:root:Executing: python /workspace/.venv/lib/python3.11/site-packages/funsearch/container/container_main.py /workspace/data/1705991385/sandbox0/call60/prog.pickle /workspace/data/1705991385/sandbox0/inputs/11.pickle /workspace/data/1705991385/sandbox0/call60/output.pickle 2> data/1705991385/sandbox0/stderr_60.log
DEBUG:root:Writing program to /workspace/data/1705991385/sandbox0/call60/program.py
Segmentation fault (core dumped)
```
Here is the stderr_60.log:

```
sandbox0# cat stderr_60.log
Traceback (most recent call last):
  File "/workspace/.venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 59, in _wrapfunc
    return bound(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'NoneType' and 'NoneType'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/workspace/.venv/lib/python3.11/site-packages/funsearch/container/container_main.py", line 27, in <module>
    main(sys.argv[1], sys.argv[2], sys.argv[3])
  File "/workspace/.venv/lib/python3.11/site-packages/funsearch/container/container_main.py", line 18, in main
    ret = func(input_data)
          ^^^^^^^^^^^^^^^^
  File "<ast>", line 17, in evaluate
  File "<ast>", line 37, in solve
  File "/workspace/.venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 1229, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 68, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 45, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'NoneType' and 'NoneType'
```
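This particular TypeError can be reproduced outside the sandbox: if the generated `solve` function passes a sequence containing `None` values to `np.argmax`, numpy falls back to object-dtype element comparison with `>`, which `NoneType` does not support. A minimal sketch (the `values` data is made up for illustration):

```python
import numpy as np

# If the evolved function feeds None values into np.argmax, numpy builds
# an object-dtype array and compares elements with `>`, which fails for
# NoneType -- the same TypeError seen in stderr_60.log.
values = [None, None, None]
try:
    np.argmax(values)
except TypeError as exc:
    print(exc)  # e.g. '>' not supported between instances of 'NoneType' and 'NoneType'
```

So the traceback itself is just a buggy LLM-generated program being scored; the segfault afterwards is the separate, more serious problem.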
Now in another run it reached call 121, but only 61 prompts/responses:

```
DEBUG:root:Writing program to /workspace/data/1705997282/sandbox0/call120/program.py
DEBUG:root:Executing: python /workspace/.venv/lib/python3.11/site-packages/funsearch/container/container_main.py /workspace/data/1705997282/sandbox0/call121/prog.pickle /workspace/data/1705997282/sandbox0/inputs/4.pickle /workspace/data/1705997282/sandbox0/call121/output.pickle 2> data/1705997282/sandbox0/stderr_121.log
DEBUG:root:Writing program to /workspace/data/1705997282/sandbox0/call121/program.py
Segmentation fault (core dumped)
```
I can reproduce the issue. This seems to be something related to llm-gpt4all. It seems to happen on the 61st repeated prompt even without funsearch involved, e.g.
```python
>>> for i in range(70):
...     model.prompt("How are you?")
...
<Response prompt='How are you?' text=" As an AI, I don't have feelings or emotions like humans do. However, I am always ready and available to assist you in any way possible.">
Segmentation fault (core dumped)
```
I couldn't immediately find any reported issues about it. I'll do a few more tests to see if this is related to running inside a container... perhaps some resource is running out.
Btw, I created the issue https://github.com/simonw/llm-gpt4all/issues/22 for the segmentation fault.
And in the latest funsearch version, I improved the parsing to find the response properly in orca-mini-3b-gguf2-q4_0 responses. Anyway, at least for me the indentation in orca responses is usually 1 space. Adding something like `response.replace("\n ", "\n  ")` could be a workaround for orca, but a proper fix might require a bit more work.
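A slightly more general sketch of that workaround, assuming each leading space in the model output corresponds to one indentation level (the `reindent` helper name and the 4-space target are my own choices, not funsearch API):

```python
def reindent(response: str, target: str = "    ") -> str:
    """Rewrite 1-space-per-level indentation (as orca-mini tends to
    emit) into `target` per level. Hypothetical helper, not part of
    the funsearch codebase."""
    out = []
    for line in response.split("\n"):
        stripped = line.lstrip(" ")
        depth = len(line) - len(stripped)  # count leading spaces as levels
        out.append(target * depth + stripped)
    return "\n".join(out)

print(reindent("def f(x):\n return x + 1"))
```

This only touches leading whitespace, so the function body text itself is unchanged; a proper fix would probably need to infer the model's indentation unit instead of assuming it is one space.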
Moving this here from the upstream funsearch repo.
Any help would be appreciated.
Originally posted by @Rubiel1 in https://github.com/google-deepmind/funsearch/issues/1#issuecomment-1902866554