SECOORA / GUTILS

🌊 🚤 Python utilities for reading, merging, and post processing Teledyne Webb Slocum Glider data
MIT License

echometrics improvements #12

Open jr3cermak opened 2 years ago

jr3cermak commented 2 years ago

Continued from PR cf-convention/vocabularies#186

Deferred:

jr3cermak commented 2 years ago

One thing I hadn't quite figured out was how to run things within the test harness. I am familiar with pytest. To produce those results, I was manually running gutils_binary_to_ascii_watch and gutils_ascii_to_netcdf_watch.

From email:

I will make a change that allows "extra_kwargs" to be specified in the deployment.json file (top level key) and then passed into each Reader's (i.e. SlocumReader) extras(data, **kwargs) method.

This will be in preparation for moving the processing code from the merge (using .bd files) to the analysis/processing (using ascii/pandas). Doing that will be much easier when we move to using dbdreader, which is a great suggestion! I didn't know it existed and it will make it much easier to work with the .bd files.

The dbdreader has its own quirks.
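A minimal sketch of the "extra_kwargs" hand-off described above, assuming a top-level key in deployment.json that is splatted into the reader's extras() hook. The class and file contents here are illustrative, not the actual GUTILS API.

```python
import json

# Hypothetical deployment.json fragment with a top-level "extra_kwargs" key.
deployment_json = """
{
    "glider": "unit_507",
    "extra_kwargs": {"enable_pseudograms": true}
}
"""

class SlocumReaderSketch:
    """Stand-in for a Reader; records the kwargs its extras() hook receives."""
    def extras(self, data, **kwargs):
        # A real reader would compute extra variables here.
        self.received = kwargs
        return data

config = json.loads(deployment_json)
reader = SlocumReaderSketch()
reader.extras([], **config.get("extra_kwargs", {}))
print(reader.received)  # {'enable_pseudograms': True}
```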

kwilcox commented 2 years ago

You can run the existing EcoMetrics tests with pytest -k TestEcoMetrics. The tests do remove any of the produced files at the end of running. I often will comment out the tearDown method of a test while I am writing the assertions so I can inspect the produced netCDF files: https://github.com/SECOORA/GUTILS/blob/master/gutils/tests/test_slocum.py#L295-L297
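The tearDown pattern mentioned here can be sketched as follows; this is a hypothetical test class, not the real TestEcoMetrics, and the file name is invented.

```python
import os
import tempfile
import unittest

class TestEcoMetricsSketch(unittest.TestCase):
    """Sketch of a test that writes output files and cleans them up in
    tearDown; comment the tearDown body out while writing assertions to
    leave the produced files on disk for inspection."""

    def setUp(self):
        self.outdir = tempfile.mkdtemp()

    def test_produces_file(self):
        # Stand-in for the conversion step that writes a netCDF file.
        produced = os.path.join(self.outdir, "ecometrics_rt.nc")
        open(produced, "w").close()
        self.assertTrue(os.path.exists(produced))

    def tearDown(self):
        # Comment these lines out to inspect the produced files afterwards.
        for f in os.listdir(self.outdir):
            os.remove(os.path.join(self.outdir, f))
        os.rmdir(self.outdir)
```

Run with `pytest -k TestEcoMetricsSketch` just like the real suite.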

I pushed a branch pseudograms-remix branch that has your initial work from cf-convention/vocabularies#186. You can PR against that!

jr3cermak commented 2 years ago

It seems to be working as is. With tearDown() disabled...

$ pytest -k TestEcoMetricsThree
============================================================================= test session starts ==============================================================================
platform linux -- Python 3.6.15, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /home/cermak/miniconda3/envs/glider/bin/python
cachedir: .pytest_cache
rootdir: /home/cermak/src/GUTILS, configfile: setup.cfg
plugins: anyio-2.2.0
collected 34 items / 33 deselected / 1 selected                                                                                                                                

tests/test_slocum.py::TestEcoMetricsThree::test_pseudogram 2022-03-15 21:14:42,212 - gutils.slocum - INFO - Converted unit_507-2022-042-1-2.sbd,unit_507-2022-042-1-2.tbd to unit_507_2022_042_1_2_sbd.dat
2022-03-15 21:14:42,348 - gutils.filters - INFO - ('Filtered 2/5 profiles from unit_507_2022_042_1_2_sbd.dat', 'Depth (1m): 1', 'Points (5): 1', 'Time (5s): 0', 'Distance (1m): 0')
PASSED

======================================================================= 1 passed, 33 deselected in 5.77s =======================================================================

The test produces three netCDF files. The last one has the desired information. The first two will need empty variables.

~/src/GUTILS/gutils/tests/resources/slocum/ecometrics3/rt$ ls -l netcdf/
total 652
-rw-rw-r-- 1 cermak staff 198712 Mar 15 21:14 ecometrics_1644647093_20220212T062453Z_rt.nc
-rw-rw-r-- 1 cermak staff 209647 Mar 15 21:14 ecometrics_1644647313_20220212T062833Z_rt.nc
-rw-rw-r-- 1 cermak staff 253545 Mar 15 21:14 ecometrics_1644648114_20220212T064154Z_rt.nc
netcdf ecometrics_1644648114_20220212T064154Z_rt {
dimensions:
        time = 20 ;
        extras = 2079 ;
variables:
        string trajectory ;
                trajectory:cf_role = "trajectory_id" ;
                trajectory:long_name = "Trajectory/Deployment Name" ;
                trajectory:comment = "A trajectory is a single deployment of a glider and may span multiple data files." ;
                trajectory:ioos_category = "Identifier" ;
....
        double pseudogram_time(extras) ;
                pseudogram_time:_FillValue = -9999.9 ;
                pseudogram_time:units = "seconds since 1990-01-01 00:00:00Z" ;
                pseudogram_time:calendar = "standard" ;
                pseudogram_time:long_name = "Pseudogram Time" ;
                pseudogram_time:ioos_category = "Other" ;
                pseudogram_time:standard_name = "pseudogram_time" ;
                pseudogram_time:platform = "platform" ;
                pseudogram_time:observation_type = "measured" ;
        double pseudogram_depth(extras) ;
                pseudogram_depth:_FillValue = -9999.9 ;
                pseudogram_depth:units = "m" ;
                pseudogram_depth:long_name = "Pseudogram Depth" ;
                pseudogram_depth:valid_min = 0. ;
                pseudogram_depth:valid_max = 2000. ;
                pseudogram_depth:ioos_category = "Other" ;
                pseudogram_depth:standard_name = "pseudogram_depth" ;
                pseudogram_depth:platform = "platform" ;
                pseudogram_depth:observation_type = "measured" ;
....
sci_echodroid_aggindex = _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, 0.0382824018597603, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
....

Continuing with other tasks... adding information seems straightforward. Do non-standard attributes cause problems? Fiddling with deployment.json and instrument.json a bit:

        double pseudogram_sv(extras) ;
                pseudogram_sv:_FillValue = -9999.9 ;
                pseudogram_sv:units = "db" ;
                pseudogram_sv:long_name = "Pseudogram SV" ;
                pseudogram_sv:colorBarMinimum = -200. ;
                pseudogram_sv:colorBarMaximum = 200. ;
                pseudogram_sv:ioos_category = "Other" ;
                pseudogram_sv:standard_name = "pseudogram_sv" ;
                pseudogram_sv:platform = "platform" ;
                pseudogram_sv:observation_type = "measured" ;
                pseudogram_sv:coordinates = "pseudogram_time pseudogram_depth" ;
                pseudogram_sv:echosounderRangeBins = 20LL ;
                pseudogram_sv:echosounderRange = 60. ;
                pseudogram_sv:echosounderRangeUnits = "meters" ;
                pseudogram_sv:echosounderDirection = "up" ;

The acoustics package has two components with separate serial numbers. Added the acoustics instrument as:

        int instrument_acoustics ;
                instrument_acoustics:_FillValue = 0 ;
                instrument_acoustics:serial_number = "269615" ;
                instrument_acoustics:make_model = "Simrad WBT Mini" ;
                instrument_acoustics:serial_number_2 = "167" ;
                instrument_acoustics:make_model_2 = "ES200-CDK-split" ;
                instrument_acoustics:comment = "Slocum Glider UAF G507" ;
                instrument_acoustics:long_name = "Kongsberg Simrad WBT Mini" ;
                instrument_acoustics:mode_operation = "EK80" ;
                instrument_acoustics:calibration_date = "" ;
                instrument_acoustics:factory_calibrated = "" ;
                instrument_acoustics:calibration_report = "" ;
                instrument_acoustics:platform = "platform" ;
                instrument_acoustics:type = "instrument" ;

If this is ok, I can look at removing the hard-coded options.

jr3cermak commented 2 years ago

Moved the config options to the instrument since they impact all the eco* variables.

        int instrument_acoustics ;
                instrument_acoustics:_FillValue = 0 ;
                instrument_acoustics:serial_number = "269615" ;
                instrument_acoustics:make_model = "Simrad WBT Mini" ;
                instrument_acoustics:serial_number_2 = "167" ;
                instrument_acoustics:make_model_2 = "ES200-CDK-split" ;
                instrument_acoustics:comment = "Slocum Glider UAF G507" ;
                instrument_acoustics:long_name = "Kongsberg Simrad WBT Mini" ;
                instrument_acoustics:mode_operation = "EK80" ;
                instrument_acoustics:echosounderRangeBins = 20LL ;
                instrument_acoustics:echosounderRange = 60. ;
                instrument_acoustics:echosounderRangeUnits = "meters" ;
                instrument_acoustics:echosounderDirection = "up" ;
                instrument_acoustics:calibration_date = "" ;
                instrument_acoustics:factory_calibrated = "" ;
                instrument_acoustics:calibration_report = "" ;
                instrument_acoustics:platform = "platform" ;
                instrument_acoustics:type = "instrument" ;
jr3cermak commented 2 years ago

We should think about grouping these so other features can be added later without getting mixed up with other provided keywords.

replace:

    "extra_kwargs": {
        "enable_pseudograms": true,
        "echosounderRange": 60.0,
        "echosounderRangeBins": 20,
        "echosounderDirection": "up",
        "echosounderRangeUnits": "meters"
    },

with?

    "extra_kwargs": {
        "pseudograms": {
               "enable": true,
               "echosounderRange": 60.0,
               "echosounderRangeBins": 20,
               "echosounderDirection": "up",
               "echosounderRangeUnits": "meters"
        }
    },
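A sketch of reading the proposed nested layout while still accepting the older flat keys. The key names follow the JSON above; the helper function itself is hypothetical, not GUTILS code.

```python
def pseudogram_config(extra_kwargs):
    """Return a pseudogram config dict from either the new grouped layout
    or the old flat extra_kwargs layout (hypothetical helper)."""
    if "pseudograms" in extra_kwargs:  # new grouped layout
        cfg = dict(extra_kwargs["pseudograms"])
        cfg.setdefault("enable", False)
        return cfg
    # fall back to the old flat layout
    return {
        "enable": extra_kwargs.get("enable_pseudograms", False),
        "echosounderRange": extra_kwargs.get("echosounderRange"),
        "echosounderRangeBins": extra_kwargs.get("echosounderRangeBins"),
        "echosounderDirection": extra_kwargs.get("echosounderDirection"),
        "echosounderRangeUnits": extra_kwargs.get("echosounderRangeUnits"),
    }

new_style = {"pseudograms": {"enable": True, "echosounderRange": 60.0}}
old_style = {"enable_pseudograms": True, "echosounderRange": 60.0}
print(pseudogram_config(new_style)["enable"])  # True
print(pseudogram_config(old_style)["enable"])  # True
```

Grouping this way leaves room for future feature blocks alongside "pseudograms" without keyword collisions.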
kwilcox commented 2 years ago

Grouping the kwargs is a great idea... extras can be used to do anything and isn't restricted to pseudogram things.

jr3cermak commented 2 years ago

Current tasks:

Interim testing:

kwilcox commented 2 years ago

@jr3cermak I played around with hosting the datasets as-is (with the extras dimension) and it won't currently work with the DAC's setup since they are on an old version of ERDDAP. Even if they did upgrade their ERDDAP version it still doesn't work wonderfully. Requesting a subset of data where variables are dimensioned by both time and extras fails to return data. I'm sure this is something Bob Simons could advise on, but for now, we have two options:

  1. Remove the extras dimension and put the pseudogram data directly into the time dimension. We did this at one point, but I likely suggested splitting it out. IMO the extras dimension is much more correct.
  2. Bypass the DAC and get the pseudogram data into the AOOS data system another way. The profile netCDF files will not include the pseudogram data and it will only be available through the AOOS data portal. It won't be archived with the glider data through NCEI.
jr3cermak commented 2 years ago

I also experimented with storing the pseudogram along the time dimension. Since the pseudogram time coordinates are different from the CTD profile's, the resultant netCDF files became very large. So, I would say writing the pseudogram data out to a separate file sounds like the best option at the moment.
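A back-of-the-envelope illustration of that size blow-up (all numbers invented): because the two instruments sample on different clocks, their time coordinates rarely coincide, so merging onto one axis pads every variable out to the union of both, mostly with _FillValue.

```python
import numpy as np

# Invented sampling schedules: CTD every 5 s, echosounder on its own clock.
ctd_time = np.arange(0, 1000, 5.0)
pseudo_time = np.arange(2.5, 1000, 7.0)

# Merging onto a single time axis requires the union of both coordinates;
# every variable then grows from its own length to len(union).
union = np.union1d(ctd_time, pseudo_time)
print(len(ctd_time), len(pseudo_time), len(union))  # 200 143 343
```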

kwilcox commented 2 years ago

:unamused:

Here is an ERDDAP Dataset that just serves the pseudogram data. I'm playing with some ideas to get this into AOOS, stay tuned.

<dataset type="EDDTableFromMultidimNcFiles" datasetID="unit_507_pseudogram" active="true">
        <!-- defaultDataQuery uses datasetID -->
        <!--
                    <defaultDataQuery>&amp;trajectory=extras_test-20220329T0000</defaultDataQuery>
                    <defaultGraphQuery>longitude,latitude,time&amp;.draw=markers&amp;.marker=2|5&.color=0xFFFFFF&.colorBar=|||||</defaultGraphQuery>
                    -->
        <reloadEveryNMinutes>1440</reloadEveryNMinutes>
        <updateEveryNMillis>-1</updateEveryNMillis>
        <!-- use datasetID as the directory name -->
        <fileDir>/datasets/gliders/ecodroid2</fileDir>
        <recursive>false</recursive>
        <fileNameRegex>.*\.nc</fileNameRegex>
        <metadataFrom>last</metadataFrom>
        <sortedColumnSourceName>pseudogram_time</sortedColumnSourceName>
        <sortFilesBySourceNames>trajectory pseudogram_time</sortFilesBySourceNames>
        <fileTableInMemory>false</fileTableInMemory>
        <accessibleViaFiles>true</accessibleViaFiles>
        <addAttributes>
            <att name="cdm_data_type">trajectoryProfile</att>
            <att name="featureType">trajectoryProfile</att>
            <!-- <att name="cdm_altitude_proxy">pseudogram_depth</att> -->
            <att name="cdm_trajectory_variables">trajectory,wmo_id</att>
            <att name="cdm_profile_variables">profile_id,profile_time,latitude,longitude</att>
            <att name="subsetVariables">trajectory,wmo_id,profile_id,profile_time,latitude,longitude</att>
            <att name="Conventions">Unidata Dataset Discovery v1.0, COARDS, CF-1.6</att>
            <att name="keywords">AUVS &gt; Autonomous Underwater Vehicles, Oceans &gt; Ocean Pressure &gt; Water Pressure, Oceans &gt; Ocean Temperature &gt; Water Temperature, Oceans &gt; Salinity/Density &gt; Conductivity, Oceans &gt; Salinity/Density &gt; Density, Oceans &gt; Salinity/Density &gt; Salinity, glider, In Situ Ocean-based platforms &gt; Seaglider, Spray, Slocum, trajectory, underwater glider, water, wmo</att>
            <att name="keywords_vocabulary">GCMD Science Keywords</att>
            <att name="Metadata_Conventions">Unidata Dataset Discovery v1.0, COARDS, CF-1.6</att>
            <att name="sourceUrl">(local files)</att>
            <att name="infoUrl">https://gliders.ioos.us/erddap/</att>
            <!-- title=datasetID -->
            <att name="title">unit_507-20220212T0000_pseudogram</att>
            <att name="ioos_dac_checksum">sdfsdf</att>
            <att name="ioos_dac_completed">False</att>
            <att name="gts_ingest">true</att>
        </addAttributes>

        <dataVariable>
            <sourceName>trajectory</sourceName>
            <destinationName>trajectory</destinationName>
            <dataType>String</dataType>
            <addAttributes>
                <att name="comment">A trajectory is one deployment of a glider.</att>
                <att name="ioos_category">Identifier</att>
                <att name="long_name">Trajectory Name</att>
            </addAttributes>
        </dataVariable>

        <dataVariable>
            <sourceName>global:wmo_id</sourceName>
            <destinationName>wmo_id</destinationName>
            <dataType>String</dataType>
            <addAttributes>
                <att name="ioos_category">Identifier</att>
                <att name="long_name">WMO ID</att>
                <att name="missing_value" type="string">none specified</att>
            </addAttributes>
        </dataVariable>

        <dataVariable>
            <sourceName>profile_id</sourceName>
            <destinationName>profile_id</destinationName>
            <dataType>int</dataType>
            <addAttributes>
                <att name="cf_role">profile_id</att>
                <att name="ioos_category">Identifier</att>
                <att name="long_name">Profile ID</att>
            </addAttributes>
        </dataVariable>

        <dataVariable>
            <sourceName>profile_time</sourceName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Time</att>
                <att name="long_name">Profile Time</att>
                <att name="comment">Timestamp corresponding to the mid-point of the profile.</att>
            </addAttributes>
        </dataVariable>

        <dataVariable>
            <sourceName>profile_lat</sourceName>
            <destinationName>latitude</destinationName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="colorBarMaximum" type="double">90.0</att>
                <att name="colorBarMinimum" type="double">-90.0</att>
                <att name="valid_max" type="double">90.0</att>
                <att name="valid_min" type="double">-90.0</att>
                <att name="ioos_category">Location</att>
                <att name="long_name">Profile Latitude</att>
                <att name="comment">Value is interpolated to provide an estimate of the latitude at the mid-point of the profile.</att>
            </addAttributes>
        </dataVariable>

        <dataVariable>
            <sourceName>profile_lon</sourceName>
            <destinationName>longitude</destinationName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="colorBarMaximum" type="double">180.0</att>
                <att name="colorBarMinimum" type="double">-180.0</att>
                <att name="valid_max" type="double">180.0</att>
                <att name="valid_min" type="double">-180.0</att>
                <att name="ioos_category">Location</att>
                <att name="long_name">Profile Longitude</att>
                <att name="comment">Value is interpolated to provide an estimate of the longitude at the mid-point of the profile.</att>
            </addAttributes>
        </dataVariable>

        <dataVariable>
            <sourceName>pseudogram_time</sourceName>
            <destinationName>time</destinationName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Time</att>
                <att name="long_name">Profile Time</att>
                <att name="comment">Timestamp corresponding to the mid-point of the profile.</att>
            </addAttributes>
        </dataVariable>

        <dataVariable>
            <sourceName>pseudogram_depth</sourceName>
            <destinationName>depth</destinationName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="colorBarMaximum" type="double">2000.0</att>
                <att name="colorBarMinimum" type="double">0.0</att>
                <att name="colorBarPalette">OceanDepth</att>
                <att name="ioos_category">Location</att>
                <att name="long_name">Depth</att>
            </addAttributes>
            </dataVariable>
        <dataVariable>
            <sourceName>pseudogram_sv</sourceName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Other</att>
            </addAttributes>
        </dataVariable>

        <dataVariable>
            <sourceName>sci_echodroid_aggindex</sourceName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Other</att>
            </addAttributes>
        </dataVariable>
        <dataVariable>
            <sourceName>sci_echodroid_ctrmass</sourceName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Other</att>
            </addAttributes>
        </dataVariable>
        <dataVariable>
            <sourceName>sci_echodroid_eqarea</sourceName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Other</att>
            </addAttributes>
        </dataVariable>
        <dataVariable>
            <sourceName>sci_echodroid_inertia</sourceName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Other</att>
            </addAttributes>
        </dataVariable>
        <dataVariable>
            <sourceName>sci_echodroid_propocc</sourceName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Other</att>
            </addAttributes>
        </dataVariable>
        <dataVariable>
            <sourceName>sci_echodroid_sa</sourceName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Other</att>
            </addAttributes>
        </dataVariable>
        <dataVariable>
            <sourceName>sci_echodroid_sv</sourceName>
            <dataType>double</dataType>
            <addAttributes>
                <att name="ioos_category">Other</att>
            </addAttributes>
        </dataVariable>
    </dataset>
jr3cermak commented 2 years ago

I am just rounding the corner where I can almost get the latest glider deployment loaded under ERDDAP. I can see it is complaining about something. This is the combined case. It does not seem happy at all with the extras dimension.

*** constructing EDDTableFromFiles unit_507_combined
dir/file table doesn't exist: /erddapData/dataset/ed/unit_507_combined/dirTable.nc
dir/file table doesn't exist: /erddapData/dataset/ed/unit_507_combined/fileTable.nc
creating new dirTable and fileTable (dirTable=null?true fileTable=null?true badFileMap=null?false)
doQuickRestart=false
574 files found in /data/combined/
regex=.*\.nc recursive=false pathRegex=.* time=22ms
old nBadFiles size=0
old fileTable size=0   nFilesMissing=0
Didn't get expected attributes because there were no previously valid files,
  or none of the previously valid files were unchanged!
EDDTableFromFiles file #0=/data/combined/G507_1644626730_20220212T004530Z_rt.nc
0 insert in fileList
0 bad file: removing fileTable row for /data/combined/G507_1644626730_20220212T004530Z_rt.nc
java.lang.RuntimeException: 
ERROR in Test.ensureEqual(Strings) line #1, col #1 'e[end]'!='t[end]':
ERROR in Table.readNDNc /data/combined/G507_1644626730_20220212T004530Z_rt.nc:
Unexpected axis#0 for variable=pseudogram_depth
Specifically, at line #1, col #1:
s1: extras[end]
s2: time[end]
    ^

 at com.cohort.util.Test.error(Test.java:43)
 at com.cohort.util.Test.ensureEqual(Test.java:340)
 at gov.noaa.pfel.coastwatch.pointdata.Table.readNDNc(Table.java:7021)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromNcFiles.lowGetSourceDataFromFile(EDDTableFromNcFiles.java:211)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromFiles.getSourceDataFromFile(EDDTableFromFiles.java:3270)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromFiles.<init>(EDDTableFromFiles.java:1543)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromNcFiles.<init>(EDDTableFromNcFiles.java:130)
 at gov.noaa.pfel.erddap.dataset.EDDTableFromFiles.fromXml(EDDTableFromFiles.java:503)
 at gov.noaa.pfel.erddap.dataset.EDD.fromXml(EDD.java:457)
 at gov.noaa.pfel.erddap.LoadDatasets.run(LoadDatasets.java:359)
netcdf G507_1644626730_20220212T004530Z_rt {
dimensions:
        time = 78 ;
        extras = 651 ;
...
        double pseudogram_time(extras) ;
                pseudogram_time:_FillValue = -9999.9 ;
                pseudogram_time:units = "seconds since 1990-01-01 00:00:00Z" ;
                pseudogram_time:calendar = "standard" ;
                pseudogram_time:long_name = "Pseudogram Time" ;
                pseudogram_time:ioos_category = "Other" ;
                pseudogram_time:standard_name = "pseudogram_time" ;
                pseudogram_time:platform = "platform" ;
                pseudogram_time:observation_type = "measured" ;

That test looks suspicious... 'e[end]' != 't[end]'. It almost looks like it wants the extras dimension to also start and end with the same timestamp?

Onto the separated case...

jr3cermak commented 2 years ago

Resync branch after PR #99 and carry on.

jr3cermak commented 2 years ago

Resync with master to take a look at the new pathway.

jr3cermak commented 2 years ago

Running the latest deployment through the current code produces a single netCDF file now. Are the profiles combined?

This is quite different from what was shown in an earlier email with the tabledap link: https://gliders.ioos.us/erddap/tabledap/extras_test-20220329T0000.htmlTable?trajectory%2Cwmo_id%2Cprofile_id%2Ctime%2Clatitude%2Clongitude%2Cdepth%2Cpseudogram_depth%2Cpseudogram_sv%2Cpseudogram_time%2Csci_echodroid_aggindex%2Csci_echodroid_ctrmass%2Csci_echodroid_eqarea%2Csci_echodroid_inertia%2Csci_echodroid_propocc%2Csci_echodroid_sa%2Csci_echodroid_sv&time%3E=2021-12-02T00%3A00%3A00Z&time%3C=2021-12-09T17%3A33%3A35Z which references: https://gliders.ioos.us/erddap/files/extras_test-20220329T0000/

On the DAC for unit_507, there are two separate sets of files *_rt.nc and the _extra_rt.nc: https://gliders.ioos.us/erddap/files/unit_507-20220212T0000/

It looks like the pseudogram is folded back into the profiles as a single file now.

jr3cermak commented 2 years ago

The latest master of GUTILS is great for backend storage of echodroid/pseudogram data.

Tossing the _extra_rt.nc files behind an aggregated netCDF dataset in THREDDS allows for full-deployment plotting.

  <dataset name="Glider extras" ID="Gretel-NC-extra" urlPath="GretelExtra.nc">
    <serviceName>all</serviceName>
    <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
      <aggregation dimName="time" type="joinExisting">
        <scan location="/home/cermak/glider/ecometrics6/rt/netcdf/" suffix="_rt_extra.nc" subdirs="false"/>
      </aggregation>
    </netcdf>
  </dataset>
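For reference, the joinExisting aggregation above is roughly equivalent to concatenating the per-segment files along their shared time dimension, e.g. with xarray. Two tiny in-memory datasets stand in for the *_rt_extra.nc files on disk here.

```python
import xarray as xr

# Two stand-ins for per-segment extras files, each with its own time slice.
ds1 = xr.Dataset({"pseudogram_sv": ("time", [-60.0, -45.0])},
                 coords={"time": [0.0, 1.0]})
ds2 = xr.Dataset({"pseudogram_sv": ("time", [-50.0, -40.0])},
                 coords={"time": [2.0, 3.0]})

# Equivalent of the NcML joinExisting aggregation over dimName="time".
agg = xr.concat([ds1, ds2], dim="time")
print(agg.sizes["time"])  # 4
```

With real files, `xr.open_mfdataset("*_rt_extra.nc", combine="nested", concat_dim="time")` would do the same without the THREDDS layer.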

Python code to pull from the aggregation just for reference.

#$ cat plotGretelDepl2.py 
import io, os, sys, struct, datetime
import subprocess
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.figure import Figure
from matplotlib.colors import LinearSegmentedColormap, Colormap
import matplotlib.dates as dates
import matplotlib.ticker as mticker
from matplotlib.patches import Rectangle
import json
import xarray as xr
#get_ipython().run_line_magic('matplotlib', 'inline')

def newFigure(figsize = (10,8), dpi = 100):

    fig = Figure(figsize=figsize, dpi=dpi)

    return fig

# Fetch Sv data
def fetchSv(start_time, end_time, ds):
    # Copy data into a numpy array and resort Sv(dB) values for plotting
    # Convert TS to string
    # datetime.datetime.strftime(datetime.datetime.utcfromtimestamp(dt), "%Y-%m-%d %H:%M:%S.%f")
    # Convert string to TS
    # datetime.datetime.strptime(dtSTR, "%Y-%m-%d %H:%M:%S.%f").timestamp()
    time_dim = 'time'
    sv_ts    = np.unique(ds[time_dim])

    startDTTM = start_time
    #startVal = datetime.datetime.strptime(startDTTM, "%Y-%m-%d %H:%M:%S.%f").timestamp()
    startVal = np.datetime64(datetime.datetime.strptime(startDTTM, "%Y-%m-%d %H:%M:%S.%f"))
    endDTTM = end_time
    #endVal = datetime.datetime.strptime(endDTTM, "%Y-%m-%d %H:%M:%S.%f").timestamp()
    endVal = np.datetime64(datetime.datetime.strptime(endDTTM, "%Y-%m-%d %H:%M:%S.%f"))

    # This obtains time indices for the unique time values
    a = np.abs(sv_ts-startVal).argmin()
    b = np.abs(sv_ts-endVal).argmin()
    #print(a,b)

    #print(time_array.shape)
    #print(list(sv.variables))
    # https://xarray.pydata.org/en/v0.11.0/time-series.html
    sv_data  = ds['pseudogram_sv'].sel(time=slice(pd.Timestamp(sv_ts[a]),pd.Timestamp(sv_ts[b])))
    sv_time  = [pd.Timestamp(t.values).timestamp() for t in ds[time_dim].sel(time=slice(pd.Timestamp(sv_ts[a]),pd.Timestamp(sv_ts[b])))]
    sv_depth = ds['depth'].sel(time=slice(pd.Timestamp(sv_ts[a]),pd.Timestamp(sv_ts[b])))

    return (sv_time, sv_depth, sv_data)

# Make plots from intermediate deployment data
def makePlot(sv_time, sv_depth, sv_data):
    # Set the default SIMRAD EK500 color table plus grey for NoData.
    simrad_color_table = [(1, 1, 1),
        (0.6235, 0.6235, 0.6235),
        (0.3725, 0.3725, 0.3725),
        (0, 0, 1),
        (0, 0, 0.5),
        (0, 0.7490, 0),
        (0, 0.5, 0),
        (1, 1, 0),
        (1, 0.5, 0),
        (1, 0, 0.7490),
        (1, 0, 0),
        (0.6509, 0.3255, 0.2353),
        (0.4705, 0.2353, 0.1568)]
    simrad_cmap = (LinearSegmentedColormap.from_list
        ('Simrad', simrad_color_table))
    simrad_cmap.set_bad(color='lightgrey')

    # Convert sv_time to something useful
    svData   = np.column_stack((sv_time, sv_depth, sv_data))

    # Filter out the noisy -5.0 and -15.0 data
    svData = np.where(svData == -5.0, -60.0, svData)
    svData = np.where(svData == -15.0, -60.0, svData)

    # Sort Sv(dB) from lowest to highest so higher values are plotted last
    svData = svData[np.argsort(svData[:,2])]

    # Plot simply x, y, z data (time, depth, dB)
    #fig, ax = plt.subplots(figsize=(10,8))
    fig = newFigure()
    ax = fig.subplots()

    #ax.xaxis.set_minor_locator(dates.MinuteLocator(interval=10))   # every 10 minutes
    #ax.xaxis.set_minor_locator(dates.HourLocator(interval=3))   # every 3 hours
    #ax.xaxis.set_minor_formatter(dates.DateFormatter('%H'))  # hours
    #ax.xaxis.set_minor_formatter(dates.DateFormatter('%H:%M'))  # hours and minutes
    ax.xaxis.set_major_locator(dates.DayLocator(interval=2))    # every day
    #ax.xaxis.set_major_formatter(dates.DateFormatter('\n%m-%d-%Y'))
    ax.xaxis.set_major_formatter(dates.DateFormatter('%m/%d'))
    ax.tick_params(which='major', labelrotation=45)

    #ax.set_facecolor('lightgray')
    ax.set_facecolor('white')

    dateData = [datetime.datetime.fromtimestamp(ts) for ts in svData[:,0]]
    #im = plt.scatter(dateData, svData[:,1], c=svData[:,2], cmap=simrad_cmap, s=30.0)
    im = ax.scatter(dateData, svData[:,1], c=svData[:,2], cmap=simrad_cmap, s=30.0)

    #cbar = plt.colorbar(orientation='vertical', label='Sv (dB)', shrink=0.40)
    fig.colorbar(im, orientation='vertical', label='Sv (dB)', shrink=0.40)

    #plt.ylim(0, sv_depth.max())

    #plt.gca().invert_yaxis()

    #plt.ylabel('Depth (m)')
    #plt.xlabel('Date (UTC)')
    ax.set(ylim=[0, sv_depth.max()], xlabel='Date (UTC)', ylabel='Depth (m)')
    #plt.clim(0, -55)
    im.set_clim(0, -55)

    # Invert axis after limits are set
    im.axes.invert_yaxis()
    #plt.title("Acoustic Scattering Volume (dB) Pseudogram")
    ax.set_title("Acoustic Scattering Volume (dB) Pseudogram")

    return fig, ax

ds = xr.open_dataset('http://mom6node0:8080/thredds/dodsC/GretelExtra.nc')

# Find the timespan of the dataset
ts_min = ds['time'].min()
ts_max = ds['time'].max()

# use the entire deployment

start_dt_string = str(ts_min.dt.strftime("%Y-%m-%d %H:%M:%S.%f").values)
end_dt_string = str(ts_max.dt.strftime("%Y-%m-%d %H:%M:%S.%f").values)

(sv_time, sv_depth, sv_data) = fetchSv(start_dt_string, end_dt_string, ds)

if len(sv_data) > 100:
    (fig, ax) = makePlot(sv_time, sv_depth, sv_data)

    imageOut = "Sv_%s_all.png" % (str(ts_min.dt.strftime("%Y%m%d").values))
    fig.savefig(imageOut, bbox_inches='tight', dpi=100)

ds.close()
kwilcox commented 2 years ago

Nice, an added benefit I didn't even think about!

jr3cermak commented 1 year ago

Cycling back around to provide an update to support future deployments. Will resync with master and move forward. Please let me know what you need in support of echometrics and low-resolution echograms (formerly pseudograms). This update will provide:

Because of GUTILS's two-stage processing, to provide a data frame the 2nd-pass script would have to be given the DBD files and the cache file directory so it can decode them into a data frame object directly. Otherwise, continue to use the 1st pass to produce the csv file and then read the csv file back in the 2nd pass to recover the data frame (kind of what happens now). The intermediate output file can be anything -- a pickled object with the data frame needed in the 2nd stage, etc.
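A sketch of the pickled-intermediate idea: stage 1 persists the decoded data frame and stage 2 reads it back without re-parsing a csv. The file name and columns are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Stage 1 output: the decoded data frame (columns are illustrative).
stage1_df = pd.DataFrame({"pseudogram_time": [0.0, 1.0],
                          "pseudogram_sv": [-60.0, -45.0]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "unit_507_extras.pkl")  # name is invented
    stage1_df.to_pickle(path)           # stage 1: persist intermediate
    stage2_df = pd.read_pickle(path)    # stage 2: recover the data frame

print(stage2_df.equals(stage1_df))  # True
```

Any serialization that round-trips the frame losslessly (pickle, parquet, etc.) would serve the same purpose.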

We need to know what target(s) to hit so we can get them built into the CI testing. Once it all passes again, we can move ahead with other fun things. It looks like Python 3.7 is EOL. Is there a particular version of Python we should use? We are at the stage of reworking the tests and updating code. I anticipate at least two to four weeks of additional effort on our side before a reasonable PR is ready. This could change based on the requirements/targets provided.

jr3cermak commented 1 year ago

Main branch readme => python 3.9 :)

jr3cermak commented 1 year ago

Unfortunately, our work has snowballed a bit. So, we will need to submit at least three PRs in total as of this writing. The first is ready to go when CI tests pass.

jr3cermak commented 1 year ago

Latest checks have passed. I have refreshed documentation in the README.pdf and have it out on a website (that may be down at some point for an OS update).

https://nasfish.fish.washington.edu/echotools/docs/html/echotools/html/echotools/README.html

The important bit is walking from the produced netCDF files (*_extra.nc) to a time series plot of the echogram profiles given any time range. So, I think that is the target product that is desired on the data portal.

https://nasfish.fish.washington.edu/echotools/docs/html/echotools/html/echotools/README.html#product

That should give us the pivot point to start heading down the pyarrow rabbit hole.

jr3cermak commented 1 year ago

Just a little more work on some additional "profile" products for echometrics. We stood up a prototype that will be used internally once implemented in some fashion on the data portal.

https://nasfish.fish.washington.edu/echotools/dppp/egramBrowser/portal.html

kwilcox commented 1 year ago

Ready for me to take a look?

jr3cermak commented 1 year ago

There is at least one more pending update with additional "profile" products to be sent. I will post another note when things have settled.

jr3cermak commented 1 year ago

You can move ahead with the current code in the PR. This other new part needs some more R&D before it can be implemented. I originally thought it was going to be an easy drop in addition. That is not the case.

jcermauwedu commented 5 months ago

A proposal for additional CF standard names has been submitted to improve standards compliance for proposed acoustic datasets, for future use in deployment.json and other configuration files.