NOAA-OWP / t-route

Tree based hydrologic and hydraulic routing

`split_csv_file`'s shell call to `awk` is not compliant with Mac OSX's distribution of `awk` #720

Closed aaraney closed 5 months ago

aaraney commented 5 months ago

Running the following awk command on Mac OSX results in the error shown below it. As a result, running t-route on Mac OSX with ngen CSV output is not possible.

awk -F ', ' '{print "114085, "$NF >> "test/outputfile_"$1".txt"}' nex-114085_output.csv

awk: syntax error at source line 1
 context is
    { print "1000000099, (NF) >> >>>  "test/outputfile_"$ <<< 1".txt" }
awk: illegal statement at source line 1

https://github.com/NOAA-OWP/t-route/blob/4ef96b4ac363bde6cdefd9b614ff0d3e007a54c8/src/troute-network/troute/AbstractNetwork.py#L887

I am not sure off the top of my head how to fix this, but I will do a little digging.

aaraney commented 5 months ago

Having done a little digging: as usual, this is a GNU vs. (I think) BSD awk difference, although I couldn't figure out whether the awk that Mac OSX ships with is the BSD version or Apple's own. In either case, the OSX awk binary seems to reject a file name that is built by concatenation directly after the output redirection operator (`>>`). Assigning the file name to a variable before the print statement seems to do the trick. Tested on GNU Awk 5.1.0 and OSX's awk version 20200816.

awk -F ', ' '{ filename="test/outputfile_"$1".txt"; print "114085, "$NF >> filename }' nex-114085_output.csv

aaraney commented 5 months ago

Just found out that Mac OSX's default limit on open file descriptors per process is 256, so each output file needs to be closed after every append.

awk -F ', ' '{ filename="test/outputfile_"$1".txt"; print "114085, "$NF >> filename; close(filename) }' nex-114085_output.csv
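
For comparison, the same append-and-close pattern can be sketched in pure Python, which sidesteps the awk differences between platforms altogether. This is only an illustrative sketch under stated assumptions, not the code in t-route: the function name `split_rows_by_first_field` and its parameters are hypothetical, and skipping blank lines is a choice made here that the awk one-liner does not make.

```python
from pathlib import Path


def split_rows_by_first_field(csv_path, out_dir, label):
    """Append '<label>, <last field>' to out_dir/outputfile_<first field>.txt per row.

    Hypothetical helper that mirrors the awk one-liner above; it is not part of t-route.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = set()
    with open(csv_path) as src:
        for line in src:
            line = line.rstrip("\n")
            if not line:
                continue  # assumption: ignore blank lines (awk would not skip them)
            fields = line.split(", ")  # same separator as awk -F ', '
            target = out_dir / f"outputfile_{fields[0]}.txt"
            # Open in append mode and let `with` close the file immediately, so
            # only one output file descriptor is open at any time. This plays
            # the role of close(filename) in the awk version.
            with open(target, "a") as dst:
                dst.write(f"{label}, {fields[-1]}\n")
            written.add(target)
    return sorted(written)
```

Called as `split_rows_by_first_field("nex-114085_output.csv", "test", "114085")`, this would produce the same `test/outputfile_<first field>.txt` files as the awk command. Reopening a file for every row is slower than keeping handles open, but it keeps the process well under a 256-descriptor limit no matter how many distinct first-field values the input contains.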

Opening a PR to resolve this issue now.

aaraney commented 5 months ago

Thanks, @shorvath-noaa!