Kitware / HPCCloud

A Cloud/Web-Based Simulation Environment
https://kitware.github.io/HPCCloud/
Apache License 2.0

WIP: Added json conversion of NWChem to workflow #557

Open chetnieter opened 7 years ago

chetnieter commented 7 years ago

Changes needed to convert the ASCII output of NWChem to JSON format so it can be parsed for visualization with WebGL.

chetnieter commented 7 years ago

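@cjh1 So this is basically working. I have added to the NWChem taskflow and it now runs the json conversion script, generates the json file and uploads it to Girder along with the rest of the files. There are a couple of issues I think need to be worked out, but they could wait until after we have added the ability to do visualization with WebGL. Here is a summary of them.

  • Since the relevant data is in the standard output of NWChem I made some assumptions about the filename that holds the output based on the results on ulmus. This is dependent on the job scheduler so right now it may only work for SGE.
  • I feel like we might want to rename the json file after it is generated since the conversion script just appends '.json' to the output filename.
  • Right now the location of the conversion script is hard coded for its location on ulmus. This should probably be part of the cluster configuration. We could consider doing the conversion on the server by including the conversion module but that would require copying the output file from the cluster to the server.
  • I added the json conversion to the upload_output task. Do we want it to live in its own task?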

Do you want me to address any of these issues or should we move on to getting the visualization working? I am thinking we may want to merge this work once we are happy with it since I am guessing the visualization will be more involved.

TristanWright commented 7 years ago

Can we keep TODOs out of the code and file them as issues, or as items on issues, even if they are temporary?

codecov-io commented 7 years ago

Current coverage is 61.80% (diff: 100%)

Merging #557 into master will not change coverage

@@             master       #557   diff @@
==========================================
  Files            59         59          
  Lines          2788       2788          
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
  Hits           1723       1723          
  Misses         1065       1065          
  Partials          0          0          

Powered by Codecov. Last update 99ea1e8...a7bd2df

cjh1 commented 7 years ago

On Wed, Dec 7, 2016 at 3:51 PM, Chet Nieter notifications@github.com wrote:

@cjh1 So this is basically working. I have added to the NWChem taskflow and it now runs the json conversion script, generates the json file and uploads it to Girder along with the rest of the files. There are a couple of issues I think need to be worked out, but they could wait until after we have added the ability to do visualization with WebGL. Here is a summary of them.

  • Since the relevant data is in the standard output of NWChem I made some assumptions about the filename that holds the output based on the results on ulmus. This is dependent on the job scheduler so right now it may only work for SGE.

We should resolve this. Can you just add a redirect in the submission script, something like nwchem input &> nwchem_output?

  • I feel like we might want to rename the json file after it is generated since the conversion script just appends '.json' to the output filename.

Sounds like a good idea

  • Right now the location of the conversion script is hard coded for its location on ulmus. This should probably be part of the cluster configuration. We could consider doing the conversion on the server by including the conversion module but that would require copying the output file from the cluster to the server.

Is the conversion script just a single file? Could we just download it as part of the taskflow, if it doesn't already exist? Do we know what the license is? Maybe we could check it into our repo ....

  • I added the json conversion to the upload_output task. Do we want it to live in its own task?

I think that probably makes sense.

Do you want me to address any of these issues or should we move on to getting the visualization working? I am thinking we may want to merge this work once we are happy with it since I am guessing the visualization will be more involved.

I would say let's resolve these issues before moving forward.
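
A minimal sketch of the first two points above, assuming the taskflow builds the submission line and file names in Python. The helper names `nwchem_command` and `renamed_json` and the default file name `nwchem_output` are illustrative assumptions, not existing HPCCloud code. The redirect is spelled `> file 2>&1`, which has the same effect as bash's `&>` but also works in a plain POSIX shell, so the output file name no longer depends on the scheduler.

```python
import shlex
from pathlib import Path


def nwchem_command(input_file: str, output_file: str = "nwchem_output") -> str:
    """Build a shell line that sends NWChem's stdout and stderr to a fixed file.

    Hypothetical helper: the fixed name removes the dependence on
    scheduler-specific output file names (e.g. the ones SGE generates).
    """
    return f"nwchem {shlex.quote(input_file)} > {shlex.quote(output_file)} 2>&1"


def renamed_json(converted: Path, stem: str) -> Path:
    """Return the path '<stem>.json' next to the converter's '<output>.json'.

    Hypothetical helper: it only computes the target path; the caller
    would perform the actual rename, e.g. converted.rename(target).
    """
    return converted.with_name(f"{stem}.json")
```

For example, `nwchem_command("input.nw")` yields a line that can be dropped into the submission script template, and `renamed_json(Path("run/nwchem_output.json"), "water")` gives the target for the rename.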


chetnieter commented 7 years ago

Can we keep TODOs out of the code and file them as issues, or as items on issues, even if they are temporary?

TODOs removed, at least from the file I touched. The list of issues now lives at #558.

chetnieter commented 7 years ago

Is the conversion script just a single file? Could we just download it as part of the taskflow, if it doesn't already exist? Do we know what the license is? Maybe we could check it into our repo ....

It is two files. One holds a Python module that does all the work and the other is just a simple Python script that calls the module.
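
A rough sketch of the "download it if it doesn't already exist" idea for the two files, assuming they are reachable under a common base URL. The function name `ensure_files`, the URL layout and the file names in the usage note are hypothetical; only `urllib.request.urlretrieve` is a real standard-library call. The license question would still need an answer before fetching or checking in the files.

```python
from pathlib import Path
from typing import Callable, Iterable, List
from urllib.request import urlretrieve


def ensure_files(
    base_url: str,
    names: Iterable[str],
    dest_dir: str,
    fetch: Callable[[str, str], object] = urlretrieve,
) -> List[str]:
    """Download each named file into dest_dir unless it is already there.

    Returns the names that were actually fetched, so a second call on the
    same directory returns an empty list. The fetch argument is injectable
    so the logic can be exercised without network access.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in names:
        target = dest / name
        if not target.exists():
            fetch(f"{base_url.rstrip('/')}/{name}", str(target))
            fetched.append(name)
    return fetched
```

A call might look like `ensure_files(base_url, ["module.py", "script.py"], "converter")`, where both file names are placeholders for the real module and the script that calls it.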

cjh1 commented 7 years ago

@chetnieter One thing to keep in mind here is that we are going to have to run this JSON through another tool, Avogadro (I will get the details), which is a C++ application. We might want to look at how we could host some of these environments in a Docker container, which is what girder_worker seems to do.

cjh1 commented 7 years ago

We can reuse parts of this; calculate_mo(...) is the function we want. @cryos Does avogadrolibs master support python wrapping?

cryos commented 7 years ago

@cjh1 yes, the way it is configured is documented in the ansible code we developed in the Phase I DOE project, mongochemdeploy. I want to move it to use PyBind11 soon, but you should be able to use this for now. It even has the workaround I used to find the right Python 3 on an Ubuntu host.

chetnieter commented 7 years ago

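One thing to keep in mind here is that we are going to have to run this JSON through another tool, Avogadro (I will get the details), which is a C++ application. We might want to look at how we could host some of these environments in a Docker container, which is what girder_worker seems to do.

I am new to Docker so I have some questions:

  • I assume that we would run the Docker container on the server.
  • This would mean we would have to copy the data from the cluster to the server.
  • Would we be pulling the image from Dockerhub?
  • I assume mounting a directory on the server with the nwchem data would be the best way to give the container access to the data?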

I can start working on a Dockerfile that creates an image with the json conversion script.

cjh1 commented 7 years ago

I am new to Docker so I have some questions:

  • I assume that we would run the Docker container on the server.

Yes, when we say server we mean the machine or machines hosting the cumulus stack

  • This would mean we would have to copy the data from the cluster to the server.

Are we not already pulling it off the cluster to upload it into Girder? When is the conversion to JSON currently taking place?

  • Would we be pulling the image from Dockerhub?

Yes, when we have something working

  • I assume mounting a directory on the server with the nwchem data would be the best way to give the container access to the data?

I would think just downloading in the container?

I can start working on a Dockerfile that creates an image with the json conversion script.

Sure, if you would like to experiment with this. Otherwise, we can punt on using a container and continue to install things directly on the server to get the end-to-end use case working, i.e. build avogadrolibs on the server.
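
For the experiment, here is a small sketch of what the bind-mount option from the questions could look like when launched from Python on the server. The image name, the script path inside the image (`/opt/convert/convert.py`) and the helper name `docker_convert_command` are all assumptions for illustration; only the `docker run`, `--rm` and `-v host:container` pieces are standard Docker CLI usage. Downloading the data inside the container, as suggested above, would drop the `-v` argument.

```python
from pathlib import PurePosixPath
from typing import List


def docker_convert_command(image: str, data_dir: str, output_name: str) -> List[str]:
    """Build an argument list that runs the conversion inside a container.

    The host directory holding the NWChem output is bind-mounted at /data.
    The script location inside the image is a placeholder, not a real path
    from any published image.
    """
    host_dir = PurePosixPath(data_dir)
    if not host_dir.is_absolute():
        # With -v, a relative source is treated as a named volume, not a path.
        raise ValueError("docker bind mounts need an absolute host path")
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:/data",
        image,
        "python", "/opt/convert/convert.py", f"/data/{output_name}",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; keeping it as a list avoids shell quoting issues.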