Closed jourdain closed 7 years ago
I guess that could be a good start
Running that new vagrant from scratch to make sure it works without the hpccloud user.
Hum got...
TASK [pyfr : Install pip3] *****************************************************
task path: /Users/seb/Documents/code/HPCCloud/HPCCloud-deploy/demo/roles/pyfr/tasks/main.yml:1
failed: [hpccloud-compute-node-vm] (item=[u'python3-pip']) => {"failed": true, "item": ["python3-pip"], "msg": "Failed to lock apt for exclusive operation"}
That role seems to be missing become: yes everywhere. Should I add it, @cjh1?
yes please
OK, thanks... I'll look around and add become: yes wherever I see become_user: root in any role within ./demo/roles.
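A rough sketch of that search, in case it helps to double-check by script instead of by eye. It is only a line-based heuristic, not a YAML parser: it assumes each task starts with "- " at column 0 (as in a typical tasks/main.yml), and the function names are made up for illustration. It will also flag tasks that inherit become from the play level, so the hits still need a manual look.

```python
import re
from pathlib import Path

# Match "become_user:" and "become:" keys, optionally as the first key of a list item.
# "become:" cannot match "become_user:" because "_" is neither whitespace nor ":".
BECOME_USER = re.compile(r"^\s*(-\s+)?become_user\s*:")
BECOME = re.compile(r"^\s*(-\s+)?become\s*:")


def split_tasks(text):
    """Split a tasks file into (start_line, lines) blocks, one per top-level list item."""
    blocks = []
    for lineno, line in enumerate(text.splitlines(), 1):
        if line.startswith("- "):
            blocks.append((lineno, [line]))
        elif blocks:
            blocks[-1][1].append(line)
    return blocks


def find_missing_become(text):
    """Return start lines of tasks that set become_user but have no become key."""
    return [
        start
        for start, lines in split_tasks(text)
        if any(BECOME_USER.match(l) for l in lines)
        and not any(BECOME.match(l) for l in lines)
    ]


def scan_roles(root):
    """Yield (path, line) for every suspicious task in *.yml files under root."""
    for path in sorted(Path(root).rglob("*.yml")):
        for line in find_missing_become(path.read_text()):
            yield path, line


if __name__ == "__main__":
    # ./demo/roles is the directory mentioned above; adjust as needed.
    for path, line in scan_roles("./demo/roles"):
        print(f"{path}:{line}: become_user without become")
```

Each reported path:line points at the start of a task that would need become: yes added.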
@cjh1 can you look at my last commit? I'm kind of worried that I had to change it in lots of locations. I just want to make sure that the change actually makes sense to you... thx
Actually, the pyfr role comes from your repo here: https://github.com/cjh1/pyfr-ansible-role/blob/master/tasks/main.yml
They look good, I will update the pyfr role
Why don't we use the pyfr role that is in the ansible directory? I just noticed that we are missing 'pycuda' and the numpy version is different.
The pyfr role in the ansible directory came first; it only installs enough for pyfr to be able to partition the mesh. The role I created installs the runtime as well. We can probably just use the one I created, as it is more complete.
I'm open to any solution. I thought we should try to rely on a single copy that lives in the same repo if we can. The difference between the two files is really small.
Seems to be working... No more errors at deployment. Trying to use it as a compute node now.
The advantage of keeping it in a separate repo is that it can be used in multiple playbooks through Ansible Galaxy.
Hmm, wondering if something went wrong with my girder:
[10:34:09.698] ERROR: Exception raise by task.
  File "/usr/local/lib/python2.7/dist-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/celery/app/trace.py", line 438, in __protected_call__
    return self.run(*args, **kwargs)
  File "cumulus/taskflow/__init__.py", line 117, in wrapped
    return func(celery_task, *args, **kwargs)
  File "/opt/hpccloud/hpccloud/server/taskflows/hpccloud/taskflow/openfoam/tutorial.py", line 201, in upload_output
    girder_token=task.taskflow.girder_token)
  File "cumulus/tasks/job.py", line 774, in upload_job_output_to_folder
    assetstore_id = get_assetstore_id(girder_token, cluster)
  File "cumulus/transport/files/__init__.py", line 54, in get_assetstore_id
    check_status(r)
  File "cumulus/common/__init__.py", line 32, in check_status
    request.raise_for_status()
  File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 844, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
HTTPError: 400 Client Error: Bad Request for url: http://10.160.1.108:8080/api/v1/sftp_assetstores
Looking at swagger, it does not show any GET endpoint there. Only POST?
I should try to pull your latest cumulus too...
This seems fine now. Do you mind reviewing it, @cjh1?
WIP