rich-iannone / splitr

Use the HYSPLIT model from inside R and do more with it

OSX: zip warning: name not matched leads to error #3

Closed jdossgollin closed 8 years ago

jdossgollin commented 8 years ago

I am running SplitR on OS X 10.10.5 with the most recent version of Hysplit4 for trajectory analysis. If I run the example code from the README, SplitR runs but gives the following error:

zip warning: name not matched.

All of the required meteorological files are downloaded, and output files are produced that look a lot like the .txt files I get from the online version of Hysplit.

Attached: a slightly modified version of the example code (I get the same error with different types of meteorological data, with 'day' runs only, etc.).

My code:

library(SplitR)

rm(list = ls())

hs.path <- '/Users/james/Hysplit4/'
trajectory_df <- hysplit_trajectory(
  traj_name = "nyc-oct-2005",
  return_traj_df = TRUE,
  start_lat_deg = 40.7127,
  start_long_deg = -74.0059,
  start_height_m_AGL = 1000,
  simulation_duration_h = 144,
  backtrajectory = TRUE,
  met_type = "reanalysis",
  vertical_motion_option = 0,
  top_of_model_domain_m = 10000,
  run_type = "range",
  run_range = c("2005-10-09", "2005-10-15"),
  daily_hours_to_start = c("00", "06", "12", "18"),
  path_met_files =  paste0(hs.path, 'met/'),
  path_output_files = paste0(hs.path, 'output_proj/'),
  path_wd = paste0(hs.path, 'working/'),
  path_executable = paste0(hs.path, 'exec/')
)

Sample console output:

 Percent complete:  56.2

...

Percent complete: 100.0
 Complete Hysplit
    zip warning: name not matched: traj-back---05-10-09-00-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-09-06-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-09-12-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-09-18-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-10-00-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-10-06-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-10-12-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-10-18-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-11-00-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-11-06-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-11-12-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-11-18-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-12-00-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-12-06-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-12-12-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-12-18-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-13-00-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-13-06-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-13-12-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-13-18-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-14-00-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-14-06-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-14-12-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-14-18-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-15-00-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-15-06-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-15-12-lat_40-7127_long_-74-0059-height_1000-144h
    zip warning: name not matched: traj-back---05-10-15-18-lat_40-7127_long_-74-0059-height_1000-144h

zip error: Nothing to do! (try: zip -r9X /Users/james/Hysplit4/output_proj/nyc-oct-2005--2015-11-04--09-58-26.zip . -i traj-back---05-10-09-00-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-09-06-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-09-12-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-09-18-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-10-00-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-10-06-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-10-12-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-10-18-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-11-00-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-11-06-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-11-12-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-11-18-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-12-00-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-12-06-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-12-12-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-12-18-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-13-00-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-13-06-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-13-12-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-13-18-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-14-00-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-14-06-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-14-12-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-14-18-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-15-00-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-15-06-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-15-12-lat_40-7127_long_-74-0059-height_1000-144h traj-back---05-10-15-18-lat_40-7127_long_-74-0059-height_1000-144h)
unzip:  cannot find or open /Users/james/Hysplit4/output_proj/nyc-oct-2005--2015-11-04--09-58-26.zip, /Users/james/Hysplit4/output_proj/nyc-oct-2005--2015-11-04--09-58-26.zip.zip or /Users/james/Hysplit4/output_proj/nyc-oct-2005--2015-11-04--09-58-26.zip.ZIP.
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file '/var/folders/_b/vhmll2t955x7dmrpml0wq3240000gn/T//RtmpqP3WkX/NA': No such file or directory

rich-iannone commented 8 years ago

@jdossgollin Sorry I didn't answer this sooner, but this shouldn't be a problem anymore. The package no longer archives the endpoint files with zip.
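For context, zip prints "name not matched" when a name it was given does not match any file, and stops with "Nothing to do!" when nothing matched at all. A minimal sketch of a guard against that, assuming a hypothetical helper `existing_outputs()` (not part of SplitR) that keeps only the files actually present before anything is archived:

```r
# Hypothetical helper, not part of SplitR: return the subset of `names`
# that exist in `dir`, warning about the rest.
existing_outputs <- function(dir, names) {
  found <- file.exists(file.path(dir, names))
  if (any(!found)) {
    warning("missing output files: ", paste(names[!found], collapse = ", "))
  }
  names[found]
}

# Demo: a temporary directory stands in for the output folder
out_dir <- tempfile("zip_guard_demo")
dir.create(out_dir)
file.create(file.path(out_dir, "traj-a"))

to_archive <- suppressWarnings(existing_outputs(out_dir, c("traj-a", "traj-b")))
```

If `to_archive` comes back empty, the archiving step can be skipped rather than handing zip a list of names it cannot find.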

jdossgollin commented 8 years ago

@rich-iannone thanks! However, I'm getting a funny error when I try to run the example from ?hysplit_trajectory:

a <- hysplit_trajectory(
  traj_name = "james",
  return_traj_df = TRUE,
  start_lat_deg = 50.108,
  start_long_deg = -122.942,
  start_height_m_AGL = 200.0,
  simulation_duration_h = 96,
  backtrajectory = TRUE,
  met_type = "reanalysis",
  vertical_motion_option = 0,
  top_of_model_domain_m = 20000,
  run_type = "years",
  run_years = "2004",
  daily_hours_to_start = c("03", "06", "09", "12",
                           "15", "18", "21"),
  path_met_files = hs.paths$met.files,
  path_output_files = hs.paths$output.files,
  path_wd = hs.paths$working,
  path_executable = hs.paths$executable
)

(where hs.paths is a list holding my file paths; the result is the same if I paste the paths in directly) gives me

Error in hysplit_trajectory(traj_name = "james", return_traj_df = TRUE,  : 
  unused arguments (path_met_files = hs.paths$met.files, path_output_files = hs.paths$output.files,  path_wd = hs.paths$working, path_executable = hs.paths$executable)

jdossgollin commented 8 years ago

FWIW this is after purging SplitR from my system and re-installing

jdossgollin commented 8 years ago

Also, this is the same if I change the return_traj_df or backtrajectory options
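An "unused arguments" error means the installed function has no parameters with those names, so comparing a call's argument names against `formals()` shows which ones to drop. A small sketch with a stand-in function (`traj_stub` and `unsupported_args()` are hypothetical; with SplitR loaded one would pass `hysplit_trajectory` instead):

```r
# Stand-in with no path_* parameters; purely illustrative
traj_stub <- function(traj_name, run_type, run_years) NULL

# Names in `args` that `fun` does not accept.
# Note: this check is not meaningful for functions that take `...`.
unsupported_args <- function(fun, args) {
  setdiff(names(args), names(formals(fun)))
}

args <- list(traj_name = "james", run_type = "years", run_years = "2004",
             path_wd = "working/", path_executable = "exec/")
dropped <- unsupported_args(traj_stub, args)
```

Here `dropped` holds the two `path_*` names, which is the same set R complains about in the error message.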

rich-iannone commented 8 years ago

James, set a working directory with setwd() (where you'd like the met files and results) and then try this:

a <- hysplit_trajectory(
  traj_name = "james",
  return_traj_df = TRUE,
  start_lat_deg = 50.108,
  start_long_deg = -122.942,
  start_height_m_AGL = 200.0,
  simulation_duration_h = 96,
  backtrajectory = TRUE,
  met_type = "reanalysis",
  vertical_motion_option = 0,
  top_of_model_domain_m = 20000,
  run_type = "years",
  run_years = "2004",
  daily_hours_to_start = c("03", "06", "09", "12",
                           "15", "18", "21")
)

I've tried to simplify the function by not requiring any paths. This makes everything a bit more R-like. Because you're using OS X, the function will use the necessary HYSPLIT binaries available in the package itself. If you already have the met files, move them to your chosen R working directory.
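A sketch of that setup, assuming the met files already sit in some other folder. The helper name `stage_met_files()`, the `.gbl` pattern, and the placeholder file name are illustrative assumptions, not SplitR functions or requirements:

```r
# Hypothetical helper: copy met files from `from` into `to`,
# creating `to` if needed. Returns the names of the files copied.
stage_met_files <- function(from, to, pattern = "\\.gbl$") {
  dir.create(to, recursive = TRUE, showWarnings = FALSE)
  files <- list.files(from, pattern = pattern, full.names = TRUE)
  ok <- file.copy(files, to, overwrite = FALSE)
  basename(files[ok])
}

# Demo with temporary folders and an empty placeholder file
src <- tempfile("met_src")
wd  <- tempfile("hysplit_wd")
dir.create(src)
file.create(file.path(src, "RP200401.gbl"))
staged <- stage_met_files(src, wd)
# setwd(wd) would then make `wd` the directory hysplit_trajectory() works in
```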

jdossgollin commented 8 years ago

I think that's a great change -- does that mean I can delete the Hysplit folder I downloaded from NOAA, or do the binaries call it?

Anyways, running the example you gave me, I get a whole bunch of messages like this

Calculation Started ... please be patient
 *ERROR* sfcinp: ASCDATA.CFG file not found!
 See MESSAGE file for more information  
 HYSPLIT4 - Initialization
 Last Changed Rev: 515
 Last Changed Date: 2013-08-06 12:06:35 -0400 (Tue, 06 Aug 2013)

and then (I think this is R)

Error in read.table(file = FILE, header = header, sep = sep, row.names = row.names,  : no lines available in input

Something's definitely going on there, though, because

the contents of these files look like

 3     1
CDC1     3    12     1     0     0
CDC1     4     1     1     0     0
CDC1     4     2     1     0     0
 1 BACKWARD OMEGA   
 4     1     1     6   50.108 -122.942   200.0
 1 PRESSURE

There are also a bunch of system files (e.g. TRAJ.CFG, but no ASCDATA.CFG).

rich-iannone commented 8 years ago

That's strange because hysplit_trajectory() places ASCDATA.CFG in the working directory before anything else is done. As a workaround, pull that file out of the Hysplit installation and into your working directory. It should contain the lines:

-90.0  -180.0  lat/lon of lower left corner (last record in file)
1.0  1.0    lat/lon spacing in degrees between data points
180  360    lat/lon number of data points
2   default land use category
0.2     default roughness length (meters)
'.'  directory location of data files

jdossgollin commented 8 years ago

That's working (although I switched to run_range because the year was running slow on my laptop -- should be better on the cluster). Thanks!

Here's what my ASCDATA.CFG looks like:

-90.0  -180.0  lat/lon of lower left corner (last record in file)
1.0     1.0    lat/lon spacing in degrees between data points
180     360    lat/lon number of data points
2              default land use category
0.2            default roughness length (meters)
'../bdyfiles/' directory location of data files

Thx again for developing this software, very glad I don't have to mess around with FORTRAN
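The workaround can also be scripted. A sketch, assuming a hypothetical helper `ensure_ascdata()` that writes the six lines quoted in this thread into a directory when the file is absent (the last line has to point at wherever the boundary data files actually live):

```r
# Hypothetical helper, not part of SplitR: create ASCDATA.CFG in `dir`
# if it is not already there, and return its path invisibly.
ensure_ascdata <- function(dir = getwd(), bdy_dir = ".") {
  path <- file.path(dir, "ASCDATA.CFG")
  if (!file.exists(path)) {
    writeLines(c(
      "-90.0  -180.0  lat/lon of lower left corner (last record in file)",
      "1.0     1.0    lat/lon spacing in degrees between data points",
      "180     360    lat/lon number of data points",
      "2              default land use category",
      "0.2            default roughness length (meters)",
      sprintf("'%s' directory location of data files", bdy_dir)
    ), path)
  }
  invisible(path)
}

# Demo in a temporary directory
cfg_dir <- tempfile("ascdata_demo")
dir.create(cfg_dir)
cfg_path <- ensure_ascdata(cfg_dir, bdy_dir = "../bdyfiles/")
cfg_lines <- readLines(cfg_path)
```

Because the helper only writes when the file is missing, an existing ASCDATA.CFG is left untouched.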

rich-iannone commented 8 years ago

@jdossgollin Great! And you're welcome! I'll try to make things run better/faster by modernizing some of the components (e.g., using the downloader, readr, and dplyr package functions). Plus: a focus on output graphics.

jdossgollin commented 8 years ago

I had good luck plotting using ggmap although the visualizations were a little heavy (running a lot of particles, though)

rich-iannone commented 8 years ago

@jdossgollin Great! A plan of mine is to create a function to plot trajectories with the leaflet package. It can handle all sorts of graphics overlaid on a base map. Plus: tooltips with more information.

rich-iannone commented 8 years ago

By the way, I think I've fixed the issue with the ASCDATA.CFG file not being available. I'll push that to the repo tonight.

jdossgollin commented 8 years ago

I've heard good things about leaflet -- will definitely take a look. In terms of ensemble runs, I built a function with data.table and foreach -- you may not want to make people install those packages (they're pretty go-to for me)

hysplit_ensemble <- function(lats, lons, s_dates, s_levels, opts){
  # lats: all latitudes from which to initialize points
  # lons: all longitudes from which to initialize points
  # s_dates: vector (date format) of all dates from which to center trajectories
  # s_levels: vector (numeric) of all levels (m) to seed from
  # opts: list of hysplit_trajectory() settings, plus days_back and days_fwd
  #       (days to run before and after each date in s_dates)

  # load required packages; library() stops with an error if one is missing
  library(SplitR) # needs to be most recent version! devtools::install_github
  library(data.table) # blazing fast for large data sets
  library(magrittr) # for the pipe operator %>%
  library(foreach) # combine data efficiently, can be run in parallel w/ minor tweaks
  library(doParallel) # only needed if %do% below is swapped for %dopar%

  s_points <- expand.grid(lats = lats, lons = lons, s_dates = s_dates, s_levels = s_levels) %>% data.table()

  multi_runs <- foreach(i = seq_len(nrow(s_points)), .combine = 'rbind') %do% {
    s_lat <- s_points[i, lats]
    s_lon <- s_points[i, lons]
    s_date <- s_points[i, s_dates]
    s_level <- s_points[i, s_levels]

    # get the run for a single particle at a given unique combination of lat, lon, and date
    single_run <- hysplit_trajectory(
      traj_name = paste0(s_date, '-', s_lat, '-', s_lon),
      return_traj_df = TRUE,
      start_lat_deg = s_lat,
      start_long_deg = s_lon,
      start_height_m_AGL = s_level,
      simulation_duration_h = opts$simulation_duration_h,
      backtrajectory = opts$backtrajectory,
      met_type = opts$met_type,
      vertical_motion_option = opts$vertical_motion_option,
      top_of_model_domain_m = opts$top_of_model_domain_m,
      run_type = opts$run_type,
      run_range = c(s_date - opts$days_back, s_date + opts$days_fwd), 
      daily_hours_to_start = opts$daily_hours_to_start
    ) %>% data.table()

    # add identifying information (the seed level is part of the grid too)
    single_run[, ':='(s_lat = s_lat, s_lon = s_lon, s_date = s_date, s_level = s_level)]

    return(single_run)
  }

  return(multi_runs)
}

jdossgollin commented 8 years ago

Also, sorry to drag this convo out, but did the ability to track specific humidity (spchumid I think) get dropped?

rich-iannone commented 8 years ago

@jdossgollin sorry for the delay but could you elaborate? Do you mean outputting meteorological params at every step? I updated the Hysplit binaries a little after your last comment (now current as of mid-2015, 2013 binaries previously). Not sure if that has anything to do with the issue.

jdossgollin commented 8 years ago

If I run the example but with reanalysis data

trajectory_df <- 
  hysplit_trajectory(
    traj_name = "t2",
    return_traj_df = TRUE,
    start_lat_deg = 42.83752,
    start_long_deg = -80.30364,
    start_height_m_AGL = 5,
    simulation_duration_h = 24,
    backtrajectory = FALSE,
    met_type = "reanalysis",
    vertical_motion_option = 0,
    top_of_model_domain_m = 20000,
    run_type = "day",
    run_day = "2012-03-12",
    daily_hours_to_start = c("00", "06", "12", "18")) 

I get

  receptor year month day hour hour.inc    lat     lon height pressure               date2       date
1        1   12     3  12    0        0 42.838 -80.304    5.0    989.8 2012-03-12 00:00:00 2012-03-12
2        1   12     3  12    1        1 42.983 -80.120    4.8    989.7 2012-03-12 01:00:00 2012-03-12
3        1   12     3  12    2        2 43.125 -79.957    4.7    989.6 2012-03-12 02:00:00 2012-03-12
4        1   12     3  12    3        3 43.262 -79.819    4.5    989.7 2012-03-12 03:00:00 2012-03-12
5        1   12     3  12    4        4 43.396 -79.705    4.3    989.9 2012-03-12 04:00:00 2012-03-12
6        1   12     3  12    5        5 43.525 -79.614    4.1    990.2 2012-03-12 05:00:00 2012-03-12

Previously there were other columns AIR_TEMP PRESSURE RAINFALL MIXDEPTH RELHUMID H2OMIXRA SPCHUMID SUN_FLUX TERR_MSL THETA -- is there a way to keep those in the traj_df?
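A quick way to confirm which of those columns are absent is `setdiff()` against the data frame's names. In this sketch the expected names are copied from the list above and `traj` is a stand-in data frame; the real column names may differ in case or spelling:

```r
expected_met <- c("AIR_TEMP", "PRESSURE", "RAINFALL", "MIXDEPTH", "RELHUMID",
                  "H2OMIXRA", "SPCHUMID", "SUN_FLUX", "TERR_MSL", "THETA")

# Stand-in for the data frame returned by hysplit_trajectory()
traj <- data.frame(receptor = 1, lat = 42.838, lon = -80.304,
                   height = 5.0, pressure = 989.8)

# Compare case-insensitively so `pressure` counts as PRESSURE
missing_met <- setdiff(expected_met, toupper(names(traj)))
```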

jdossgollin commented 8 years ago

Also the ASCDATA.CFG is working now, good fix

jdossgollin commented 8 years ago

I'm fairly certain that the way around this problem is to set some of the 0s to 1 in the TRAJ.CFG file, but I'm not sure how to do that using the SplitR binaries
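For reference, the tm_* entries in TRAJ.CFG are plain `name = value,` lines, so switching them on is a text substitution. A sketch with a hypothetical helper `enable_tm_flags()`; whether SplitR rewrites TRAJ.CFG on each run is not established here, in which case a manual edit would not survive:

```r
# Hypothetical helper: set every `tm_xxx = 0` line to 1, leaving other lines alone
enable_tm_flags <- function(lines) {
  is_off <- grepl("^\\s*tm_[a-z]+\\s*=\\s*0\\s*,?\\s*$", lines)
  lines[is_off] <- sub("=\\s*0", "= 1", lines[is_off])
  lines
}

# Demo on a shortened namelist
cfg <- c(" &SETUP", " kmixd = 0,", " tm_relh = 1,", " tm_sphu = 0,",
         " tm_mixr = 0,", " dxf = 1.00,", " /")
updated <- enable_tm_flags(cfg)
```

Only lines whose name starts with `tm_` are touched, so other zero-valued settings such as `kmixd` keep their values.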

rich-iannone commented 8 years ago

@jdossgollin worked on this a bit and now it's a default behaviour to include all of the extra met along the trajectories (in the output files and in the returned data frame)

jdossgollin commented 8 years ago

That's awesome. At the moment my TRAJ.CFG looks like

 &SETUP
 tratio = 0.75,
 delt = 0.0,
 mgmin = 10,
 khmax = 9999,
 kmixd = 0,
 kmsl = 0,
 kagl = 1,
 k10m = 1,
 nstr = 0,
 mhrs = 9999,
 nver = 0,
 tout = 60,
 tm_pres = 1,
 tm_tpot = 1,
 tm_tamb = 1,
 tm_rain = 1,
 tm_mixd = 1,
 tm_relh = 1,
 tm_sphu = 0,
 tm_mixr = 0,
 tm_dswf = 1,
 tm_terr = 1,
 dxf = 1.00,
 dyf = 1.00,
 dzf = 0.01,
 messg = 'MESSAGE',
 /

where the only one I'm missing is the specific humidity flag, tm_sphu (tm_mixr is also still 0). If it's really trivial, it would be great to get that included -- if not, it's pretty easy for me to calculate from everything else; it would just slow my script down a small amount. Thanks!
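For the fallback calculation mentioned here, one common approximation derives specific humidity from relative humidity, air temperature, and pressure, using a Magnus-type formula with Bolton's coefficients for saturation vapor pressure. This is a sketch under stated assumptions: temperature in kelvin, pressure in hPa, relative humidity in percent. The units of the trajectory output should be checked before relying on it, and the function name is hypothetical:

```r
# Specific humidity (kg/kg) from relative humidity (%), air temperature (K),
# and pressure (hPa). Units are assumptions, not taken from SplitR docs.
specific_humidity <- function(rh_pct, temp_k, pres_hpa) {
  temp_c <- temp_k - 273.15
  # saturation vapor pressure over water (hPa), Magnus-type approximation
  e_sat <- 6.112 * exp(17.67 * temp_c / (temp_c + 243.5))
  e <- rh_pct / 100 * e_sat           # actual vapor pressure (hPa)
  0.622 * e / (pres_hpa - 0.378 * e)  # q = eps * e / (p - (1 - eps) * e)
}

q_sat <- specific_humidity(rh_pct = 100, temp_k = 293.15, pres_hpa = 1000)
q_dry <- specific_humidity(rh_pct = 0,   temp_k = 293.15, pres_hpa = 1000)
```

The function is vectorized, so it can be applied to whole columns of a trajectory data frame in one call.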

rich-iannone commented 8 years ago

@jdossgollin it's now fixed and all the met params are provided.

jdossgollin commented 8 years ago

@rich-iannone it seems to have reverted -- I tried deleting all the files in the directory and it's back to how it was before, with none of the params.

rich-iannone commented 8 years ago

@jdossgollin sorry, I should have mentioned this, but use return_met_along_traj = TRUE in the function call. By default it is set to FALSE.

jdossgollin commented 8 years ago

got it, works like a charm thanks

rich-iannone commented 8 years ago

Great! Check out the trajectory_plot function as well...

jdossgollin commented 8 years ago

I like the interactive visualization -- that's definitely where graphics are headed.