jonppe opened 10 months ago
@Rubiel1 do you have Docker or Podman installed? Which version? Are you running Windows or Linux?
You could try setting the environment variable `LOGLEVEL=debug` and then re-running the command.
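A minimal sketch of that step, assuming a POSIX shell (the commented-out `funsearch run` invocation is a placeholder; substitute your actual command):

```shell
# Enable debug logging for subsequent commands in this shell.
export LOGLEVEL=debug
echo "LOGLEVEL is now: $LOGLEVEL"
# Then re-run the command, e.g. (placeholder, adapt to your setup):
# funsearch run examples/your_spec.py 11
```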
In particular, I'd check the command that looks something like this:

```
DEBUG:root:Executing: podman run --stop-timeout=30 -v /home/johannes/..../funsearch/funsearch/container/container_main.py:/main.py:ro -v /home/johannes/..../funsearch/data/1705907648/sandbox0/call1:/workspace -v /home/johannes/..../funsearch/data/1705907648/sandbox0/inputs/11.pickle:/input.pickle:ro funsearch_sandbox:latest /usr/local/bin/python3 /main.py /workspace/prog.pickle /input.pickle /workspace/output.pickle 2> data/1705907648/sandbox0/stderr_1.log
```
And then also check the error file, `stderr_1.log` in this case. But perhaps the log has other interesting parts too.
I haven't paid quite as much attention to Windows and Docker support, so maybe there's something I missed. The command tries to mount container_main.py as /main.py inside the container, but perhaps this doesn't work with some kinds of folder structures or folder names on Windows...
Hello, I use Fedora 38 with 62.6 GiB of RAM.
I erased everything and installed funsearch again, and after `podman run...` (in the workspace) I run

```
llm install llm-gpt4all
```

which allowed me to specify the model

```
--model_name orca-mini-3b-gguf2-q4_0
```
After that the program runs; however, it eventually crashes after writing the 61st prompt message.
The testing parameters of the last experiment were

```python
functions_per_prompt: int = 2
num_islands: int = 4
reset_period: int = 4 * 60 * 60
cluster_sampling_temperature_init: float = 0.1
cluster_sampling_temperature_period: int = 30_000
backup_period: int = 10
backup_folder: str = './data/backups'
```

and

```python
programs_database: ProgramsDatabaseConfig = dataclasses.field(
    default_factory=ProgramsDatabaseConfig)
num_samplers: int = 1
num_evaluators: int = 1
samples_per_prompt: int = 4
```
with messages

```
DEBUG:root:Executing: python /workspace/.venv/lib/python3.11/site-packages/funsearch/container/container_main.py /workspace/data/1705991385/sandbox0/call60/prog.pickle /workspace/data/1705991385/sandbox0/inputs/11.pickle /workspace/data/1705991385/sandbox0/call60/output.pickle 2> data/1705991385/sandbox0/stderr_60.log
DEBUG:root:Writing program to /workspace/data/1705991385/sandbox0/call60/program.py
Segmentation fault (core dumped)
```
Here is the stderr_60.log:

```
sandbox0# cat stderr_60.log
Traceback (most recent call last):
  File "/workspace/.venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 59, in _wrapfunc
    return bound(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'NoneType' and 'NoneType'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/workspace/.venv/lib/python3.11/site-packages/funsearch/container/container_main.py", line 27, in <module>
    main(sys.argv[1], sys.argv[2], sys.argv[3])
  File "/workspace/.venv/lib/python3.11/site-packages/funsearch/container/container_main.py", line 18, in main
    ret = func(input_data)
          ^^^^^^^^^^^^^^^^
  File "<ast>", line 17, in evaluate
  File "<ast>", line 37, in solve
  File "/workspace/.venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 1229, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 68, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.venv/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 45, in _wrapit
    result = getattr(asarray(obj), method)(*args, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'NoneType' and 'NoneType'
```
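This particular TypeError can be reproduced outside the sandbox: if the generated `solve` function passes a sequence containing `None` values to `np.argmax`, numpy falls back to object-dtype element comparison with `>`, which `NoneType` does not support. A minimal sketch (the `values` data is made up for illustration):

```python
import numpy as np

# If the evolved function feeds None values into np.argmax, numpy builds
# an object-dtype array and compares elements with `>`, which fails for
# NoneType -- the same TypeError seen in stderr_60.log.
values = [None, None, None]
try:
    np.argmax(values)
except TypeError as exc:
    print(exc)  # e.g. '>' not supported between instances of 'NoneType' and 'NoneType'
```

So the traceback itself is just a buggy LLM-generated program being scored; the segfault afterwards is the separate, more serious problem.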
Now in another run it reached call 121, but only 61 prompts/responses:

```
DEBUG:root:Writing program to /workspace/data/1705997282/sandbox0/call120/program.py
DEBUG:root:Executing: python /workspace/.venv/lib/python3.11/site-packages/funsearch/container/container_main.py /workspace/data/1705997282/sandbox0/call121/prog.pickle /workspace/data/1705997282/sandbox0/inputs/4.pickle /workspace/data/1705997282/sandbox0/call121/output.pickle 2> data/1705997282/sandbox0/stderr_121.log
DEBUG:root:Writing program to /workspace/data/1705997282/sandbox0/call121/program.py
Segmentation fault (core dumped)
```
I can reproduce the issue. This seems to be something related to llm-gpt4all. It seems to happen on the 61st repeated prompt even without funsearch involved, e.g.
```python
>>> for i in range(70):
...     model.prompt("How are you?")
...
<Response prompt='How are you?' text=" As an AI, I don't have feelings or emotions like humans do. However, I am always ready and available to assist you in any way possible.">
Segmentation fault (core dumped)
```
I couldn't immediately find any reported issues about it. I'll do a few more tests to see if this is related to running inside a container... perhaps some resource is running out.
Btw, I created the issue https://github.com/simonw/llm-gpt4all/issues/22 for the segmentation fault.
And in the latest funsearch version, I improved the parsing to find the response properly in orca-mini-3b-gguf2-q4_0 responses. Anyway, at least for me the indentation in orca responses is usually 1 space. Adding something like `response.replace("\n ", "\n  ")` could be a workaround for orca, but a proper fix might require a bit more work.
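A slightly more general sketch of that workaround, assuming each leading space in the model output corresponds to one indentation level (the `reindent` helper name and the 4-space target are my own choices, not funsearch API):

```python
def reindent(response: str, target: str = "    ") -> str:
    """Rewrite 1-space-per-level indentation (as orca-mini tends to
    emit) into `target` per level. Hypothetical helper, not part of
    the funsearch codebase."""
    out = []
    for line in response.split("\n"):
        stripped = line.lstrip(" ")
        depth = len(line) - len(stripped)  # count leading spaces as levels
        out.append(target * depth + stripped)
    return "\n".join(out)

print(reindent("def f(x):\n return x + 1"))
```

This only touches leading whitespace, so the function body text itself is unchanged; a proper fix would probably need to infer the model's indentation unit instead of assuming it is one space.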
Moving this here from the upstream funsearch repo.
Any help would be appreciated.
Originally posted by @Rubiel1 in https://github.com/google-deepmind/funsearch/issues/1#issuecomment-1902866554