Cavenfish / autogamess

This is a python module for automating Raman calculations using GAMESS(us).
MIT License
9 stars 8 forks source link

Question about fill_spreadsheet and continuing JOSS review #11

Closed colleeneb closed 5 years ago

colleeneb commented 5 years ago

This is part of a JOSS review.

Almost done! :) Just a few things left.

Editorial comment:

In “Examples of Common AutoGAMESS Utilization Methods” section in README.md, there are 3 examples. Could these be titled separately (giving each example a heading), for clarity?

General comment:

Since the test inputs were all from the Windows version of GAMESS, I tested if the log parsing worked with the Linux version of GAMESS to see if it could parse the log files and if the inputs it generated ran properly. Everything worked properly, at least for water :) .

Issue:

I have been trying all the functions in README.md. My overall goal was to generate data for Optimization, Hessian, and Raman for the H2O input, and then use fill_spreadsheet to parse the data into the spreadsheet. I ran into an issue with fill_spreadsheet, though (described below).

First, I was able to generate an input with make_input, run it with GAMESS, use opt2hes to generate a hessian input, run it with GAMESS, use hes2raman to get a Raman input, and then run it. I then used new_project to generate a directory structure, and sort_logs to sort the logs into the Logs/Sorted subdirectory. After doing this, I end up with all the logs in the Logs/Sorted/H2O/ subdirectory:

bash-3.2$ ls  Project\ Title/Logs/Sorted/H2O/
AGv1-0-35_H2O_B3LYP_CCD_hes.log
AGv1-0-35_H2O_B3LYP_CCD_opt.log
AGv1-0-35_H2O_B3LYP_CCD_raman.log

However, one thing has tripped me up—after doing what I described above, when I tried to run fill_spreadsheets, it failed with this error:

Traceback (most recent call last):
  File "fill.py", line 11, in <module>
    ag.fill_spreadsheets(sorteddir=sorteddir, sheetsdir=sheetsdir)
  File "", line 174, in fill_spreadsheets
    temp = df[ram]
KeyError: 'Raman'

It looks like the input.csv I used to generate the project structure only had one line, for H2O Optimization, and it looks like to pull out Raman and Hessian information, these need to be in input.csv, too. So I tried to regenerate the spreadsheets and directory structure with the following input.csv

bash-3.2$ cat input.csv 
Species,Theory,Basis Sets,External Basis Sets,Run Types
H2O,B3LYP,CCD,,Raman
H2O,B3LYP,CCD,,Hessian
H2O,B3LYP,CCD,,Optimization

But this input.csv failed to generate a directory structure:

bash-3.2$ python
>>> import autogamess as ag
>>> csvfile = './input.csv'
>>> maindir= './'
>>> title   = 'Project Title/'
>>> ag.new_project(maindir, csvfile, title=title)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/colleen/autogamess/autogamess/new_project.py", line 115, in new_project
    os.makedirs(inputs+runtyp+specie)
  File "/anaconda3/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
FileExistsError: [Errno 17] File exists: './Project Title/Inps/Raman/H2O/'

Any advice on how to proceed? Or maybe I missed something? My use case was to generate the data for Optimization, Hessian, and Raman, for H2O, and then just use fill_spreadsheet to parse the data into the spreadsheet. Maybe this is not a normal use case?

Cavenfish commented 5 years ago

Hey @colleeneb, glad to hear the data parsing worked for Linux! I never tested it with the Linux version. I will make the headings in the README right away.

In terms of the issue you ran into, the CSV file should only have each item on there once, if you take a look at the input.csv in the tests directory it has the input for running water all the way to raman with B3LYP CCD. It should look like this:

Species Theory Basis Sets External Basis Sets Run Types
H2O B3LYP CCD Optimization
Hessian
Raman

AutoGAMESS will generate a project directory assuming every combination of, basis sets (internal and external), theory and run type will be performed for all species. So in the above example it assumes H2O B3LYP CCD will run Optimization, Hessian and Raman.

colleeneb commented 5 years ago

Thanks for the help. Could you mention somewhere in README.md something along the lines of what you said—“the CSV file should only have each item on there once, but could have multiple run types”, and add a few lines in the CSV example in the README about it? Just so other people don’t get tripped up.

Once I changed input.csv as suggested, the spreadsheet was generated as expected with ag.fill_spreadsheets('./Project Title/'). However, it moved the log files from Sorted directory to [project]/Pass/[run_type]/[species]/, which I didn’t expect—could this be documented? I only expected it to fill the spreadsheets, not move files, too.

On that note, trying to run

import autogamess as ag                                                                                                                                                                                         
sorteddir = './Project Title/Logs/Sorted/'                                                                                                                                                                      
sheetsdir = './Project Title/Spreadsheets/'                                                                                                                                                                     
ag.fill_spreadsheets(sorteddir=sorteddir, sheetsdir=sheetsdir) 

Using the same setup as before—log files for H20 for Optimization, Hessian, and Raman in “./Project Title/Logs/Sorted/“, running the code above resulted in the following error:

Traceback (most recent call last):
  File "fill.py", line 11, in <module>
    ag.fill_spreadsheets(sorteddir=sorteddir, sheetsdir=sheetsdir)
  File "/Users/colleen/autogamess/autogamess/fill_spreadsheets.py", line 203, in fill_spreadsheets
    move2 = passdir + ram + '/' + dir + '/' + file
UnboundLocalError: local variable 'passdir' referenced before assignment

It looks like in fill_spreadsheets.py, passedir is only defined if projdir is True, but it’s used later even if it's not true.

Cavenfish commented 5 years ago

Hey @colleeneb I will add that to the README right away! I will also add that fill_spreadsheets moves the files to the documentation as well. I'm also going to fix the passdir issue, right away.

Again, as always thank you so much for the comments and bug spotting!

colleeneb commented 5 years ago

(One possible suggestion is to put any example that is in the README.md also in the testing setup, so you know those are being tested. However, this is not required for the review/acceptance, just a general comment.)

colleeneb commented 5 years ago

Last thing! (I'm staying in this issue since it's related. I repulled and tried running

import autogamess as ag                                                                                                                                                                                         
sorteddir = './Project Title/Logs/Sorted/'                                                                                                                                                                      
sheetsdir = './Project Title/Spreadsheets/'                                                                                                                                                                     
ag.fill_spreadsheets(sorteddir=sorteddir, sheetsdir=sheetsdir) 

again, and now I get the error below. So passdir or something else is missing a / (it seems to be looking for './Project TitleLogs/Pass/Raman.)

Traceback (most recent call last):
  File "fill.py", line 9, in <module>
    ag.fill_spreadsheets(sorteddir=sorteddir, sheetsdir=sheetsdir)
  File "/Users/colleen/autogamess/autogamess/fill_spreadsheets.py", line 213, in fill_spreadsheets
    os.rename(filename, move2)
FileNotFoundError: [Errno 2] No such file or directory: './Project Title/Logs/Sorted/H2O/AGv1-0-35_H2O_B3LYP_CCD_raman.log' -> './Project TitleLogs/Pass/Raman/H2O/AGv1-0-35_H2O_B3LYP_CCD_raman.log'
Cavenfish commented 5 years ago

@colleeneb I just fixed it (at least I think i did lol). Let me know if its working now.

colleeneb commented 5 years ago

Yes, that worked for me, thanks!

colleeneb commented 5 years ago

@Cavenfish This is outside of the scope of the review, but I wanted to mention that if you are interested in making changes/improvements to GAMESS itself, you can follow the instructions here, and email to request access to the GAMESS private repo on github. (I moved this from the review page to here, since I think it's more appropriate here than in the review page. Sorry for spamming you twice.)

Cavenfish commented 5 years ago

@colleeneb Not a problem, twice isn't too bad! I will look into it, thank you for passing along this information.