rburghol opened this issue 2 years ago
Url to already run .h5 file: http://deq1.bse.vt.edu:81/files/cbp/OR1_7700_7980.h5
First, navigate to the directory:
cd cbp_river_example
Downloading the file in Terminal:
wget -N http://deq1.bse.vt.edu:81/files/cbp/OR1_7700_7980.h5
Note: the `-N` flag tells wget to overwrite the local copy if the incoming file is newer.
Note: do not use `wget -rN http://deq1.bse.vt.edu:81/files/cbp/OR1_7700_7980.h5`; use only `-N`, because `-r` causes the folder structure to be copied as well (like `files/cbp/OR1_7700_7980.h5`), and we don't want that.
Previewing group names:
h5ls("OR1_7700_7980.h5", recursive = FALSE)
Listing with a true recursive argument will give a much longer output but include detailed file paths for further exploration:
h5ls("OR1_7700_7980.h5")
For example: "/TIMESERIES/TS1001/_i_table/index"
For reference, the signatures of the two functions (from the rhdf5 documentation):
h5ls(file, recursive = TRUE, all = FALSE, datasetinfo = TRUE, index_type, native = FALSE)
h5read(file, name, index = NULL, start = NULL, stride = NULL, block = NULL, count = NULL, compoundAsDataFrame = TRUE, callGeneric = TRUE, read.attributes = FALSE, drop = FALSE, native = FALSE, s3 = FALSE, s3credentials = NULL)
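As a self-contained sketch of the listing workflow (assumes only that the rhdf5 Bioconductor package is installed; the tiny file and the `demo/values` dataset are made up here so the example runs without the big download):

```r
library(rhdf5)

# create a tiny throwaway .h5 file to demonstrate the listing functions
tmp <- tempfile(fileext = ".h5")
h5createFile(tmp)
h5createGroup(tmp, "demo")
h5write(matrix(1:6, nrow = 2), tmp, "demo/values")

# shallow listing: top-level groups only
h5ls(tmp, recursive = FALSE)

# full recursive listing: gives the complete paths usable with h5read()
h5ls(tmp)

# read a dataset back using one of those paths
h5read(tmp, "demo/values")
```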
read1 <- h5read("OR1_7700_7980.h5", name = "/TIMESERIES/TS1001", index = 10)
Error in H5Dread(...): Not enough memory to read data! Try to read a subset of data by specifying the index or count parameter.
Error in h5checktype(): H5Identifier not valid.
h5read passes its arguments through to H5Dread, which reads a (possibly partial) dataset from an HDF5 file. Could this help with the error above?
H5Dread signature:
H5Dread(h5dataset, h5spaceFile = NULL, h5spaceMem = NULL, buf = NULL, compoundAsDataFrame = TRUE, bit64conversion, drop = FALSE)
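If the error really is the read being too large, one workaround suggested by the error message itself is to request only a slice via h5read's `index` argument. A sketch, assuming the file is in the working directory and that `/TIMESERIES/TS1001/table` is the dataset of interest:

```r
library(rhdf5)

# read only the first 10 rows of the table instead of the whole group;
# 'index' takes a list with one element per dimension (NULL = everything)
first_rows <- h5read("OR1_7700_7980.h5", "/TIMESERIES/TS1001/table",
                     index = list(1:10))
head(first_rows)
```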
@juliabruneau - good information/hunting. Perhaps this is due to a partial set? Though, I think that the memory error comes from an unspecific query. For example, if I keep knocking things off the end of the h5read path, I get more and more warnings, presumably as the data retrieved gets larger, so maybe this error just means there is too much data? If I run h5dump and grep-search for "TIMESERIES/TS1001", i.e. `h5dump -n OR1_7700_7980.h5 | grep TS1001`, I get an inkling of what the contents of that timeseries may be, and how to get at it, and wow are there a lot of components in that table. See below Terminal Code 1. It makes me think that I should try something more specific: by adding /table on the end of the path, I get actual data and no errors (but some warnings appear).
Terminal Code 1: Output of h5dump -n OR1_7700_7980.h5 | grep TS1001
group /TIMESERIES/TS1001
group /TIMESERIES/TS1001/_i_table
group /TIMESERIES/TS1001/_i_table/index
dataset /TIMESERIES/TS1001/_i_table/index/abounds
dataset /TIMESERIES/TS1001/_i_table/index/bounds
dataset /TIMESERIES/TS1001/_i_table/index/indices
dataset /TIMESERIES/TS1001/_i_table/index/indicesLR
dataset /TIMESERIES/TS1001/_i_table/index/mbounds
dataset /TIMESERIES/TS1001/_i_table/index/mranges
dataset /TIMESERIES/TS1001/_i_table/index/ranges
dataset /TIMESERIES/TS1001/_i_table/index/sorted
dataset /TIMESERIES/TS1001/_i_table/index/sortedLR
dataset /TIMESERIES/TS1001/_i_table/index/zbounds
group /TIMESERIES/TS1001/_i_table/values
dataset /TIMESERIES/TS1001/_i_table/values/abounds
dataset /TIMESERIES/TS1001/_i_table/values/bounds
dataset /TIMESERIES/TS1001/_i_table/values/indices
dataset /TIMESERIES/TS1001/_i_table/values/indicesLR
dataset /TIMESERIES/TS1001/_i_table/values/mbounds
dataset /TIMESERIES/TS1001/_i_table/values/mranges
dataset /TIMESERIES/TS1001/_i_table/values/ranges
dataset /TIMESERIES/TS1001/_i_table/values/sorted
dataset /TIMESERIES/TS1001/_i_table/values/sortedLR
dataset /TIMESERIES/TS1001/_i_table/values/zbounds
dataset /TIMESERIES/TS1001/table
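Following that hint, a sketch of reading the /table leaf directly (assumes the file has been downloaded to the working directory):

```r
library(rhdf5)

# the actual data lives in the 'table' dataset, not in the group itself
ts1001 <- h5read("OR1_7700_7980.h5", "/TIMESERIES/TS1001/table")
head(ts1001)
```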
Hey all - see below, which is excerpted from the test cases that we worked on yesterday (see also #211). This one gets us the data that we want and gives clues as to where to look for other data (hint: maybe not TIMESERIES).
rchres_data = h5read("OR1_7700_7980.h5", "/RESULTS/RCHRES_R001/HYDR/table")
names(rchres_data)
quantile(rchres_data$ROVOL)
Attempting as.POSIXct(441766800000000000, origin = "1970-01-01", tz = "UTC") to convert Unix seconds into a readable time, I get:
[1] NA
as.POSIXct(44176680000000000, origin = "1970-01-01", tz = "UTC")
[1] "1399905230-08-14 16:00:00 UTC"
... a seriously post-Star Trek date, but at least it's a date.
as.POSIXct(4417668000000, origin = "1970-01-01", tz = "UTC")
[1] "141960-04-29 16:00:00 UTC"
as.POSIXct(441766800, origin = "1970-01-01", tz = "UTC")
[1] "1984-01-01 01:00:00 UTC"
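The pattern above suggests the index is epoch time in nanoseconds: stripping nine zeros lands on a sane 1984 date. A minimal self-contained check of that conversion (base R only; the raw value is the first index entry from the table above):

```r
# raw 64-bit index value from the h5 table (nanoseconds since 1970-01-01)
raw_index <- 441766800000000000

# divide by 1e9 to get seconds, then convert as usual
stamp <- as.POSIXct(raw_index / 1e9, origin = "1970-01-01", tz = "UTC")
format(stamp)  # "1984-01-01 01:00:00" -- the sane result from above
```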
Using H5Dread to get 64-bit timestamps:
fid = H5Fopen("OR1_7700_7980.h5")
did = H5Dopen(fid, "RESULTS/RCHRES_R001/HYDR/table")
H5Dread(did, bit64conversion= "double")
index DEP IVOL O1 O2 O3 OVOL1 OVOL2 OVOL3
1 4.417668e+17 0.2415072 8.847264 0 0 2.055183 0 0 0.1229398
2 4.417704e+17 0.3002159 8.828485 0 0 3.175832 0 0 0.2161576
3 4.417740e+17 0.3486096 8.810900 0 0 4.282218 0 0 0.3081839
4 4.417776e+17 0.3905506 8.793962 0 0 5.374578 0 0 0.3990412
5 4.417812e+17 0.4279465 8.777404 0 0 6.453111 0 0 0.4887475
6 4.417848e+17 0.4619085 8.761090 0 0 7.517995 0 0 0.5773184
7 4.417884e+17 0.4931512 8.744946 0 0 8.569401 0 0 0.6647684
8 4.417920e+17 0.5221504 8.728931 0 0 9.606858 0 0 0.7510851
9 4.417956e+17 0.5492472 8.713009 0 0 10.629819 0 0 0.8362263
10 4.417992e+17 0.5747209 8.697166 0 0 11.638691 0 0 0.9201863
Don't forget to close the open data objects, both the file and dataset, when finished
H5Dclose(did)
H5Fclose(fid)
rchres1 <- H5Dread(did, bit64conversion= "double")
head(rchres1)
index DEP IVOL O1 O2 O3 OVOL1 OVOL2 OVOL3 PRSUPY
1 4.417668e+17 0.2415072 8.847264 0 0 2.055183 0 0 0.1229398 0
2 4.417704e+17 0.3002159 8.828485 0 0 3.175832 0 0 0.2161576 0
3 4.417740e+17 0.3486096 8.810900 0 0 4.282218 0 0 0.3081839 0
4 4.417776e+17 0.3905506 8.793962 0 0 5.374578 0 0 0.3990412 0
5 4.417812e+17 0.4279465 8.777404 0 0 6.453111 0 0 0.4887475 0
6 4.417848e+17 0.4619085 8.761090 0 0 7.517995 0 0 0.5773184 0
RO ROVOL SAREA TAU USTAR VOL VOLEV
1 2.055183 0.1229398 67.47366 0.02344255 0.1099862 15.79433 0
2 3.175832 0.2161576 83.87602 0.02914127 0.1226281 24.40666 0
3 4.282218 0.3081839 97.39652 0.03383873 0.1321425 32.90938 0
4 5.374578 0.3990412 109.11423 0.03790982 0.1398658 41.30430 0
5 6.453111 0.4887475 119.56212 0.04153978 0.1464090 49.59296 0
6 7.517995 0.5773184 129.05061 0.04483640 0.1521076 57.77673 0
origin <- "1970-01-01"
rchres1$index <- as.POSIXct((rchres1$index)/1000000000, origin = origin, tz = "UTC")
head(rchres1)
                index       DEP     IVOL O1 O2       O3 OVOL1 OVOL2     OVOL3
1 1984-01-01 01:00:00 0.2415072 8.847264  0  0 2.055183     0     0 0.1229398
2 1984-01-01 02:00:00 0.3002159 8.828485  0  0 3.175832     0     0 0.2161576
3 1984-01-01 03:00:00 0.3486096 8.810900  0  0 4.282218     0     0 0.3081839
4 1984-01-01 04:00:00 0.3905506 8.793962  0  0 5.374578     0     0 0.3990412
5 1984-01-01 05:00:00 0.4279465 8.777404  0  0 6.453111     0     0 0.4887475
6 1984-01-01 06:00:00 0.4619085 8.761090  0  0 7.517995     0     0 0.5773184
  PRSUPY       RO     ROVOL     SAREA        TAU     USTAR      VOL VOLEV
1      0 2.055183 0.1229398  67.47366 0.02344255 0.1099862 15.79433     0
2      0 3.175832 0.2161576  83.87602 0.02914127 0.1226281 24.40666     0
3      0 4.282218 0.3081839  97.39652 0.03383873 0.1321425 32.90938     0
4      0 5.374578 0.3990412 109.11423 0.03790982 0.1398658 41.30430     0
5      0 6.453111 0.4887475 119.56212 0.04153978 0.1464090 49.59296     0
6      0 7.517995 0.5773184 129.05061 0.04483640 0.1521076 57.77673     0
Note: We found that this table's last timestamp is 1984-09-02 02:00:00
HDFView 3.1.4
We can explore the .h5 files with an application called HDFView. It is used specifically to open .hdf5/.h5 files, and it provides a directory tree to look into the different groups and attributes within the .h5 file. The only "limitation" is that you have to register on the website in order to download the application, but it only asks for your email and what organization you're a part of (academic research).
This is the process to access the files in HDFView:
- Download the application: https://www.hdfgroup.org/downloads/hdfview/?1656346198
- Download the .zip file: 'HDFView-3.1.4-win10_64-vs16.zip'
- Extract the .zip file with something like 7-zip
- Download the .h5 file: http://deq1.bse.vt.edu:81/files/cbp/OR1_7700_7980.h5
- Do this by right-clicking on the link, and then choosing: 'Save link as...'
- If the browser discards the download, click the up arrow and hit 'Keep'
- Click on the downloaded .h5 file to open it (this will open it automatically in the HDFViewer)
- Now you are able to see all the groups and different "layers" in our .h5 file
- Ultimately, you're able to click on 'Show Data with Options', which will provide another window with a table, and you can extract the table as a text file (shown below)
This Viewer provides more understanding on the contents of a hdf5 file, and it can hopefully help understand how we can extract the timestamp using R. Maybe we can utilize the Viewer's function to extract .txt files?
Update: Can save table as a .txt file to computer. Working on putting into R.
Attempting to View Output .h5 Files in the HDFView Program:
- After running the land test case (https://github.com/HARPgroup/HARParchive/issues/211), I thought it would be easier to explore and compare the river and land model outputs if we were to open them in the viewer.
- It would allow us to view the groups, subgroups, and data tables by clicking on them as folders rather than repeatedly running commands.
- However, to open them in the viewer they need to be downloaded to your local computer. Since we do not have a link on Github as we did for the first .h5 file, this became an issue.
- *Command to copy files from server to local computer (found online):
scp user@server:/path/to/remotefile.zip /Local/Target/Destination
- *For multiple files at once:
scp user@host:/remote/path/\{file1.zip,file2.zip\} /Local/Path/
- With this, I struggled with how to reference my local computer because calling the local disk "C:" means nothing to the server
- These are the main things I tried:
- `scp megpritch@deq2:~/OR1_7700_7980.h5 ~/Desktop/`
- received: _/home/megpritch/Desktop/: Is a directory_
- Not sure what to do with this information, because it doesn't seem to be an error
- ` scp megpritch@deq2:~/\{OR1_7700_7980.h5,forA51800.h5\} ip_address/Desktop/folder_name`
- received: _No such file or directory_
- `/home/megpritch/OR1_7700_7980.h5 ~\folder_name\OR1_7700_7980.h5`
- This said that it downloaded successfully, but then I couldn't find it on my computer. It turns out it made a copy of the file inside my deq account's home directory, but renamed it "folder_nameOR1_7700_7980.h5"
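A likely fix for the attempts above (a sketch; the username, hostname, and filenames are this thread's examples, not verified): run scp from a terminal on the local computer rather than on the server, so the destination path is interpreted by the local shell and the local disk never has to mean anything to the server.

```shell
# Run these on your LOCAL machine (e.g. Git Bash on Windows),
# not while logged in to the server.

# single file: remote source first, local destination second
scp megpritch@deq2:~/OR1_7700_7980.h5 ~/Desktop/

# multiple files in one call (quoted so the brace list is expanded
# on the remote side rather than by the local shell)
scp "megpritch@deq2:~/{OR1_7700_7980.h5,forA51800.h5}" ~/Desktop/
```

This also explains the earlier results: run on the server, `~/Desktop/` pointed at a directory in the server home, and the unescaped backslash path was treated as part of a local filename.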
What Now?
It would still be useful, since if we simply want to use it to help us understand structure better, we only really need one H5 file for rivers and another H5 file for land as a template. That is because each land H5 file will share an identical structure with every other land H5 file; similarly for river h5s. I like that you tried scp, and it seems like you got close. We can find a more efficient solution tomorrow.
Overview
hdf5 is a file-based database used in scientific applications, including hsp2.
Installation