EPCCed / wee_archie

BSD 3-Clause "New" or "Revised" License

Issue with Running Windtunnel example on a supercomputer #41

Closed cstyl closed 1 year ago

cstyl commented 1 year ago

Hi,

I have been trying to run the windtunnel example, but instead of a Raspberry Pi cluster I am using a supercomputer, which also has login and compute nodes.

I have followed the process described in the Readme.md and adapted the hostfile etc. to match my system. Note that I am launching an interactive job and putting the name of that node in the hostfile.

However, I haven't really understood what simulations.db is for, how to create it, or what it should contain. At the moment I am creating an empty one by running sqlite3 simulations.db and then exiting.
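For context, a database created this way contains no tables until a schema is loaded into it. The sketch below illustrates that with Python's built-in sqlite3 module. The schema string is a hypothetical stand-in (its table and column names are assumptions for illustration, not the project's actual definitions), and an in-memory database is used instead of simulations.db.

```python
import sqlite3

# Hypothetical stand-in for a schema file; the real table
# definitions used by the framework may differ.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS Simulations (
    SIMID TEXT PRIMARY KEY,
    Executable TEXT
);
CREATE TABLE IF NOT EXISTS Instances (
    IID TEXT PRIMARY KEY,
    SIMID TEXT,
    Status TEXT
);
"""


def list_tables(conn):
    """Return the names of all tables in the database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


# An in-memory database behaves like a freshly created, empty file.
conn = sqlite3.connect(":memory:")
tables_before = list_tables(conn)   # nothing has been created yet

# executescript runs every statement in the schema text.
conn.executescript(SCHEMA_SQL)
tables_after = list_tables(conn)

print("before:", tables_before)
print("after: ", tables_after)
```

From the shell, the equivalent would be along the lines of sqlite3 simulations.db < schema.sql, where schema.sql is a placeholder for whichever schema file the repository provides.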

When I run the GUI on my laptop so that the simulation runs on the compute node of the interactive job, the GUI appears, but when I click to run the simulation I get the following error in my local terminal:

pythonw Start.py -s http://myip:5000/ -c 5
Server initialised for simulation 'CDFD'.
Server is http://myip:5000/
Aerofoil
Created parameter file
http://myip:5000/simulation/CDFD
Simulation Started. ID=f8400cba-86c0-46b8-bba9-3fc25f3dbc2d
Process initiated
Process Process-1:
Traceback (most recent call last):
  File "/Volumes/work/miniconda3/envs/windtunnel/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/Volumes/work/miniconda3/envs/windtunnel/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "../framework/client/process.py", line 21, in process
    status=servercomm.GetStatus()
  File "../framework/client/servercomm.py", line 39, in GetStatus
    self.status=json.loads(statusrequest.text)
  File "/Volumes/work/miniconda3/envs/windtunnel/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/Volumes/work/miniconda3/envs/windtunnel/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Volumes/work/miniconda3/envs/windtunnel/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

and on the server side I see:

Press CTRL+C to quit
- - [26/Jul/2023 17:02:26] "POST /simulation/CDFD HTTP/1.1" 201 -
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/nvme/h/cy21cs1/miniconda3/envs/windtunnel/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/nvme/h/cy21cs1/wee_archie/framework/server/simulation_runner.py", line 58, in run
    executablefile = conn.getSimulationExecutable(self.kwargs['simid'])
  File "/nvme/h/cy21cs1/wee_archie/framework/server/frameworkdb.py", line 110, in getSimulationExecutable
    cursor.execute('SELECT ExecutionPrefix, ExecutionHostFile, Executable, ChangeToDir FROM Simulations WHERE SIMID=?', (simulationid, ))
sqlite3.OperationalError: no such table: Simulations

[2023-07-26 17:02:26,501] ERROR in app: Exception on /simulation/CDFD/f8400cba-86c0-46b8-bba9-3fc25f3dbc2d [GET]
Traceback (most recent call last):
  File "/nvme/h/cy21cs1/miniconda3/envs/windtunnel/lib/python3.7/site-packages/flask/app.py", line 2529, in wsgi_app
    response = self.full_dispatch_request()
  File "/nvme/h/cy21cs1/miniconda3/envs/windtunnel/lib/python3.7/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/nvme/h/cy21cs1/miniconda3/envs/windtunnel/lib/python3.7/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/nvme/h/cy21cs1/miniconda3/envs/windtunnel/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/nvme/h/cy21cs1/wee_archie/framework/server/framework.py", line 113, in get_instance_status
    data['status'] = conn.getInstanceStatus(simid,instanceid)
  File "/nvme/h/cy21cs1/wee_archie/framework/server/frameworkdb.py", line 54, in getInstanceStatus
    cursor.execute('SELECT Status FROM Instances WHERE SIMID=? AND IID=?', (simulationid, instanceid))
sqlite3.OperationalError: no such table: Instances
 - - [26/Jul/2023 17:02:26] "GET /simulation/CDFD/f8400cba-86c0-46b8-bba9-3fc25f3dbc2d HTTP/1.1" 500
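
One possible reading of the two logs together: the status request gets an HTTP 500 reply whose body is not JSON, so json.loads on the client fails at the first character. The sketch below shows a defensive way to decode such a reply. parse_status is a hypothetical helper, not part of the framework's servercomm.py, and the example bodies are made up for illustration.

```python
import json


def parse_status(status_code, body):
    """Decode a status reply, returning None when it is not usable.

    Hypothetical helper: it takes the HTTP status code and the raw
    response text rather than a real response object.
    """
    if status_code != 200:
        # Error replies (such as a 500) may carry an HTML error page.
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # The body was not valid JSON.
        return None


# Made-up example bodies for a successful and a failed request.
ok = parse_status(200, '{"status": "RUNNING"}')
failed = parse_status(500, "<!doctype html><title>500 Internal Server Error</title>")

print(ok)
print(failed)
```

With a check like this the client could report that the server failed instead of crashing, although the underlying database error on the server would still need fixing.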

Any idea what might be at fault here? Also, could you please provide a bit more information about simulations.db?

Best Regards, Chris

cstyl commented 1 year ago

I think I was creating simulations.db without loading the .sql file into it, which is why I was getting the sqlite3.OperationalError: no such table error. However, on the server side I now end up with the following:

 - - [27/Jul/2023 11:16:39] "POST /simulation/CDFD HTTP/1.1" 201 -
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/nvme/h/cy21cs1/miniconda3/envs/windtunnel/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/nvme/h/cy21cs1/wee_archie/framework/server/simulation_runner.py", line 58, in run
    executablefile = conn.getSimulationExecutable(self.kwargs['simid'])
  File "/nvme/h/cy21cs1/wee_archie/framework/server/frameworkdb.py", line 111, in getSimulationExecutable
    cursor.execute('SELECT ExecutionPrefix, ExecutionHostFile, Executable, ChangeToDir FROM Simulations WHERE SIMID=?', (simulationid, ))
sqlite3.OperationalError: no such column: ChangeToDir

However, the .sql file does not define such a column.
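
A minimal sketch of how the mismatch could be checked, and one possible workaround, using Python's sqlite3 module. The CREATE TABLE statement here is an assumption reconstructed from the column names in the failing query, not the contents of the actual .sql file, and adding ChangeToDir as a nullable TEXT column is a guess at what the server expects.

```python
import sqlite3


def column_names(conn, table):
    """Return the column names of a table via PRAGMA table_info."""
    # Each row describes one column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# In-memory stand-in for simulations.db with an assumed schema that
# lacks the ChangeToDir column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Simulations ("
    "SIMID TEXT PRIMARY KEY, ExecutionPrefix TEXT, "
    "ExecutionHostFile TEXT, Executable TEXT)"
)

# Columns named in the query from the traceback.
expected = ["ExecutionPrefix", "ExecutionHostFile", "Executable", "ChangeToDir"]
missing = [c for c in expected if c not in column_names(conn, "Simulations")]
print("missing:", missing)

# Possible workaround (assumption): add each missing column as TEXT.
for col in missing:
    conn.execute(f"ALTER TABLE Simulations ADD COLUMN {col} TEXT")

print("columns now:", column_names(conn, "Simulations"))
```

Whether an empty ChangeToDir value is acceptable to the server is not something the logs above confirm, so using a schema that matches the server code may be the safer route.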