NOAA-OWP / wres

Code and scripts for the Water Resources Evaluation Service
Other
2 stars 1 forks source link

As a developer, I want to understand why there are two identical requests to obtain location information from WRDS and, if the second call is unnecessary, remove it #283

Open epag opened 2 months ago

epag commented 2 months ago

Author Name: Hank (Hank) Original Redmine Issue: 98035, https://vlab.noaa.gov/redmine/issues/98035 Original Date: 2021-10-27


The declaration is below. The log messages indicating two calls:

2021-10-26T19:13:15.513+0000 INFO WrdsFeatureService  Getting location data from https://nwcal-wrds.[host]/api/location/v3.0/metadata/nwm_feature_id/5907079
2021-10-26T19:13:16.664+0000 INFO WrdsFeatureService  Getting location data from https://nwcal-wrds.[host]/api/location/v3.0/metadata/usgs_site_code/01631000
2021-10-26T19:13:17.272+0000 INFO WrdsFeatureService  Getting location data from https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2
2021-10-26T19:13:17.800+0000 INFO WrdsFeatureService  Getting location data from https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2

I don't think two calls should be necessary, but it doesn't do any harm from what I can tell, so this is a normal priority item.

Thanks,

Hank

EDIT: If you remove the @featureGroup@, below, and, instead, just evaluation WALN6 and FROV2 as singletons, the duplicate calls are still seen. So its not a feature group byproduct.

===================================================

<?xml version='1.0' encoding='UTF-8'?>
<project>
  <inputs>
    <left>
      <type>observations</type>
      <source interface="usgs_nwis">https://nwis.waterservices.usgs.gov/nwis/iv</source>
      <variable>00060</variable>
    </left>
    <right>
      <type>single valued forecasts</type>
      <source interface="wrds_nwm">https://nwcal-wrds.[host]/api/nwm2.1/v2.0/ops/medium_range</source>
      <variable>streamflow</variable>
    </right>
    <baseline featureDimension="nws_lid" label="WRDS AHPS API Single Valued Streamflow Forecasts">
      <type>single valued forecasts</type>
      <source interface="wrds_ahps">https://nwcal-wrds.[host]/api/rfc_forecast/v2.0/forecast/streamflow</source>
      <variable>QR</variable>
    </baseline>

  </inputs>
  <pair>
    <unit>ft3/s</unit>
    <featureService>
      <baseUrl>https://nwcal-wrds.[host]/api/location/v3.0/metadata</baseUrl>
    </featureService>
    <feature left="01631000" right="5907079"/>

        <featureGroup name = "A FEW FEATURES">
            <feature baseline="FROV2"/>
            <feature baseline="WALN6"/>
        </featureGroup>

    <leadHours minimum="0" maximum="240"/>
    <dates earliest="2021-09-27T00:00:00Z" latest="2021-10-07T00:00:00Z"/>
    <issuedDates earliest="2021-09-27T00:00:00Z" latest="2021-10-01T00:00:00Z"/>
    <leadTimesPoolingWindow>
      <period>3</period>
      <unit>hours</unit>
    </leadTimesPoolingWindow>
  </pair>
  <metrics>
    <thresholds>
      <type>value</type>
      <applyTo>left</applyTo>
      <source format="wrds" parameterToMeasure="flow" featureNameFrom="baseline" >https://nwcal-wrds.[host]/api/location/v3.0/nws_threshold/nws_lid</source>
      <operator>greater than or equal to</operator>
    </thresholds>

    <metric><name>sample size</name></metric>
    <metric><name>pearson correlation coefficient</name></metric>
  </metrics>
  <outputs>
    <destination type="graphic" />
    <destination type="csv" />
    <destination type="csv2" />
    <destination type="pairs"/>
  </outputs>
</project>
</code>
epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-10-27T11:54:51Z


Speculation in #96392-70. First thing I would check is whether the same thing happens without the feature group, i.e. add them as singletons - I suspect it does (and hence is unrelated to feature groups).

epag commented 2 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2021-10-27T11:56:57Z


James: Posting the below since I wrote it while you wrote your message. I'll try the singleton run in a bit. I want to spin up another run of a feature group with thresholds, first.

Here is what I wrote...

I ran the evaluation in debug for #96392 and noted these log messages:

2021-10-26T19:48:55.960+0000 DEBUG FeatureFinder Discovered 1 feature groups with features to densify.
2021-10-26T19:48:55.960+0000 DEBUG FeatureFinder These need lookups: {(NWS_LID,USGS_SITE_CODE)=[WALN6, FROV2], (NWS_LID,NWM_FEATURE_ID)=[WALN6, FROV2]}

I wasn't sure why there were two instants of "[WALN6, FROV2]" in the "These need lookups" list. I also posted additional logging below. Later, James wondered if this is expected:

Oh, actually, perhaps it makes sense if the correlations are sought in pairs? In other words, find the left/right feature correlations for the usgs site codes with nws feature names WALN6, FROV2, then find the left/baseline feature correlations for the nwm feature identifiers with the nws feature names WALN6, FROV2. Something like that, perhaps?

James also recommended,

To debug this easily, we'd only want debug enabled on wres.io.geography. You can edit the logback.xml to do that.

I'll do that in the future. For now, I'm focused on testing the feature group capability with thresholds. Leaving this ticket,

Hank

==========================================================================

2021-10-26T19:48:55.960+0000 INFO WrdsFeatureService  Getting location data from https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2
2021-10-26T19:48:55.961+0000 DEBUG WebClient getFromWeb https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2
2021-10-26T19:48:55.961+0000 DEBUG WebClient Called tryRequest with Request{method=GET, url=https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2, headers=[Accept-Encoding:gzip, User-Agent:wres-io/20211026-dc3baff]}
2021-10-26T19:48:56.236+0000 DEBUG WebClient For https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2, request headers are Accept-Encoding: gzip
User-Agent: wres-io/20211026-dc3baff
 response headers are server: nginx/1.21.3
date: Tue, 26 Oct 2021 19:49:07 GMT
content-type: application/json
vary: Accept, Cookie, Origin
allow: GET, HEAD, OPTIONS
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=31536000;includeSubDomains
content-encoding: gzip

2021-10-26T19:48:56.236+0000 DEBUG WebClient Successfully got InputStream from https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2 in PT0.27502S
2021-10-26T19:48:56.236+0000 DEBUG WrdsFeatureService Raw response, decoded as UTF-8: {"_metrics":{"location_count":2,"model_tracing_api_call":0.011864662170410156,"total_request_time":0.2672419548034668},"_warnings":[],"_documentation":{"swagger URL":"http://nwcal-wrds.[host]/docs/location/v3.0/swagger/"},"deployment":{"api_url":"https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2/","stack":"prod","version":"v3.3.0","api_caller":"None"},"data_sources":{"metadata_sources":["NWS data: NRLDB - Last updated: 2021-10-05 03:54:41 UTC","USACE data: USACE - Last updated: 2021-07-26 14:19:33 UTC","USGS data: USGS NWIS - Last updated: 2021-10-05 03:18:43 UTC"],"crosswalk_datasets":{"location_nwm_crosswalk_dataset":{"location_nwm_crosswalk_dataset_id":"1.1","name":"Location NWM Crosswalk v1.1","description":"Created 20210524.  Source 1) NWM Routelink File v2.1-corrected   2) NHDPlus v2.1+    3) GID"},"nws_usgs_crosswalk_dataset":{"nws_usgs_crosswalk_dataset_id":"1.0","name":"NWS Station to USGS Gages 1.0","description":"Authoritative 1.0 dataset mapping NWS Stations to USGS Gages"}}},"locations":[{"identifiers":{"nws_lid":"FROV2","usgs_site_code":"01631000","nwm_feature_id":"5907079","goes_id":"DD501084","env_can_gage_id":null},"nws_data":{"name":"Front Royal","wfo":"LWX","rfc":"MARFC","geo_rfc":"MARFC","latitude":38.913888888889,"longitude":-78.211111111111,"map_link":"https://maps.google.com/maps?t=k&q=loc:38.913888888889+-78.211111111111","horizontal_datum_name":"NAD27","state":"Virginia","county":"Warren","county_code":51187,"huc":"02070005","hsa":"LWX","zero_datum":468.93,"vertical_datum_name":"NAVD88","rfc_forecast_point":true,"rfc_defined_fcst_point":true,"riverpoint":true},"usgs_data":{"name":"S F SHENANDOAH RIVER AT FRONT ROYAL, VA","geo_rfc":"MARFC","latitude":38.91400059,"longitude":-78.21083388,"map_link":"https://maps.google.com/maps?t=k&q=loc:38.91400059+-78.21083388","coord_accuracy_code":"U","latlon_datum_name":"NAD83","coord_method_code":"M","state":"Virginia","huc":"02070005","site_type":"ST","altitude":468.93,"alt_accuracy_code":0.01,"alt_datum_code":"NAVD88","alt_method_code":"D","drainage_area":1634.0,"drainage_area_units":"square miles","contrib_drainage_area":null,"active":true,"gages_ii_reference":false},"nwm_feature_data":{"downstream_feature_id":5907071,"latitude":38.90788,"longitude":-78.21186,"altitude":144.19,"stream_length":6480.0,"stream_order":5,"mannings_roughness":0.05,"slope":0.00046,"channel_side_slope":0.2249492,"nhd_waterbody_comid":null},"env_can_gage_data":{"name":null,"latitude":null,"longitude":null,"map_link":null,"drainage_area":null,"contrib_drainage_area":null,"water_course":null},"nws_preferred":{"name":"Front Royal","latitude":38.913888888889,"longitude":-78.211111111111,"latlon_datum_name":"NAD27","state":"Virginia","huc":"02070005"},"usgs_preferred":{"name":"S F SHENANDOAH RIVER AT FRONT ROYAL, VA","latitude":38.91400059,"longitude":-78.21083388,"latlon_datum_name":"NAD83","state":"Virginia","huc":"02070005"},"upstream_nwm_features":["5907077"],"downstream_nwm_features":["5907071"]},{"identifiers":{"nws_lid":"WALN6","usgs_site_code":"01423000","nwm_feature_id":"2613578","goes_id":"DDA9F73E","env_can_gage_id":null},"nws_data":{"name":"Walton","wfo":"BGM","rfc":"MARFC","geo_rfc":"MARFC","latitude":42.166111111111,"longitude":-75.140277777778,"map_link":"https://maps.google.com/maps?t=k&q=loc:42.166111111111+-75.140277777778","horizontal_datum_name":"UNK","state":"New York","county":"Delaware","county_code":36025,"huc":"02040101","hsa":"BGM","zero_datum":1189.88,"vertical_datum_name":"NAVD88","rfc_forecast_point":true,"rfc_defined_fcst_point":true,"riverpoint":true},"usgs_data":{"name":"WEST BRANCH DELAWARE RIVER AT WALTON NY","geo_rfc":"MARFC","latitude":42.1661111,"longitude":-75.14002778,"map_link":"https://maps.google.com/maps?t=k&q=loc:42.1661111+-75.14002778","coord_accuracy_code":"1","latlon_datum_name":"NAD83","coord_method_code":"N","state":"New York","huc":"02040101","site_type":"ST","altitude":1189.88,"alt_accuracy_code":0.01,"alt_datum_code":"NAVD88","alt_method_code":"L","drainage_area":332.0,"drainage_area_units":"square miles","contrib_drainage_area":null,"active":true,"gages_ii_reference":true},"nwm_feature_data":{"downstream_feature_id":2613590,"latitude":42.15888,"longitude":-75.14107,"altitude":360.38,"stream_length":1589.0,"stream_order":4,"mannings_roughness":0.055,"slope":0.0005,"channel_side_slope":0.2855113,"nhd_waterbody_comid":null},"env_can_gage_data":{"name":null,"latitude":null,"longitude":null,"map_link":null,"drainage_area":null,"contrib_drainage_area":null,"water_course":null},"nws_preferred":{"name":"Walton","latitude":42.166111111111,"longitude":-75.140277777778,"latlon_datum_name":"UNK","state":"New York","huc":"02040101"},"usgs_preferred":{"name":"WEST BRANCH DELAWARE RIVER AT WALTON NY","latitude":42.1661111,"longitude":-75.14002778,"latlon_datum_name":"NAD83","state":"New York","huc":"02040101"},"upstream_nwm_features":["2613508","2613510"],"downstream_nwm_features":["2613590"]}]}
2021-10-26T19:48:56.238+0000 DEBUG WrdsFeatureService For from=NWS_LID and to=USGS_SITE_CODE, found these: {WALN6=01423000, FROV2=01631000}
2021-10-26T19:48:56.238+0000 DEBUG WrdsFeatureService Last of the feature names to request: [WALN6, FROV2]
2021-10-26T19:48:56.238+0000 INFO WrdsFeatureService  Getting location data from https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2
2021-10-26T19:48:56.238+0000 DEBUG WebClient getFromWeb https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2
2021-10-26T19:48:56.238+0000 DEBUG WebClient Called tryRequest with Request{method=GET, url=https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2, headers=[Accept-Encoding:gzip, User-Agent:wres-io/20211026-dc3baff]}
2021-10-26T19:48:56.478+0000 DEBUG WebClient For https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2, request headers are Accept-Encoding: gzip
User-Agent: wres-io/20211026-dc3baff
 response headers are server: nginx/1.21.3
date: Tue, 26 Oct 2021 19:49:07 GMT
content-type: application/json
vary: Accept, Cookie, Origin
allow: GET, HEAD, OPTIONS
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=31536000;includeSubDomains
content-encoding: gzip

2021-10-26T19:48:56.479+0000 DEBUG WebClient Successfully got InputStream from https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2 in PT0.240394S
2021-10-26T19:48:56.479+0000 DEBUG WrdsFeatureService Raw response, decoded as UTF-8: {"_metrics":{"location_count":2,"model_tracing_api_call":0.00910496711730957,"total_request_time":0.2330772876739502},"_warnings":[],"_documentation":{"swagger URL":"http://nwcal-wrds.[host]/docs/location/v3.0/swagger/"},"deployment":{"api_url":"https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2/","stack":"prod","version":"v3.3.0","api_caller":"None"},"data_sources":{"metadata_sources":["NWS data: NRLDB - Last updated: 2021-10-05 03:54:41 UTC","USACE data: USACE - Last updated: 2021-07-26 14:19:33 UTC","USGS data: USGS NWIS - Last updated: 2021-10-05 03:18:43 UTC"],"crosswalk_datasets":{"location_nwm_crosswalk_dataset":{"location_nwm_crosswalk_dataset_id":"1.1","name":"Location NWM Crosswalk v1.1","description":"Created 20210524.  Source 1) NWM Routelink File v2.1-corrected   2) NHDPlus v2.1+    3) GID"},"nws_usgs_crosswalk_dataset":{"nws_usgs_crosswalk_dataset_id":"1.0","name":"NWS Station to USGS Gages 1.0","description":"Authoritative 1.0 dataset mapping NWS Stations to USGS Gages"}}},"locations":[{"identifiers":{"nws_lid":"FROV2","usgs_site_code":"01631000","nwm_feature_id":"5907079","goes_id":"DD501084","env_can_gage_id":null},"nws_data":{"name":"Front Royal","wfo":"LWX","rfc":"MARFC","geo_rfc":"MARFC","latitude":38.913888888889,"longitude":-78.211111111111,"map_link":"https://maps.google.com/maps?t=k&q=loc:38.913888888889+-78.211111111111","horizontal_datum_name":"NAD27","state":"Virginia","county":"Warren","county_code":51187,"huc":"02070005","hsa":"LWX","zero_datum":468.93,"vertical_datum_name":"NAVD88","rfc_forecast_point":true,"rfc_defined_fcst_point":true,"riverpoint":true},"usgs_data":{"name":"S F SHENANDOAH RIVER AT FRONT ROYAL, VA","geo_rfc":"MARFC","latitude":38.91400059,"longitude":-78.21083388,"map_link":"https://maps.google.com/maps?t=k&q=loc:38.91400059+-78.21083388","coord_accuracy_code":"U","latlon_datum_name":"NAD83","coord_method_code":"M","state":"Virginia","huc":"02070005","site_type":"ST","altitude":468.93,"alt_accuracy_code":0.01,"alt_datum_code":"NAVD88","alt_method_code":"D","drainage_area":1634.0,"drainage_area_units":"square miles","contrib_drainage_area":null,"active":true,"gages_ii_reference":false},"nwm_feature_data":{"downstream_feature_id":5907071,"latitude":38.90788,"longitude":-78.21186,"altitude":144.19,"stream_length":6480.0,"stream_order":5,"mannings_roughness":0.05,"slope":0.00046,"channel_side_slope":0.2249492,"nhd_waterbody_comid":null},"env_can_gage_data":{"name":null,"latitude":null,"longitude":null,"map_link":null,"drainage_area":null,"contrib_drainage_area":null,"water_course":null},"nws_preferred":{"name":"Front Royal","latitude":38.913888888889,"longitude":-78.211111111111,"latlon_datum_name":"NAD27","state":"Virginia","huc":"02070005"},"usgs_preferred":{"name":"S F SHENANDOAH RIVER AT FRONT ROYAL, VA","latitude":38.91400059,"longitude":-78.21083388,"latlon_datum_name":"NAD83","state":"Virginia","huc":"02070005"},"upstream_nwm_features":["5907077"],"downstream_nwm_features":["5907071"]},{"identifiers":{"nws_lid":"WALN6","usgs_site_code":"01423000","nwm_feature_id":"2613578","goes_id":"DDA9F73E","env_can_gage_id":null},"nws_data":{"name":"Walton","wfo":"BGM","rfc":"MARFC","geo_rfc":"MARFC","latitude":42.166111111111,"longitude":-75.140277777778,"map_link":"https://maps.google.com/maps?t=k&q=loc:42.166111111111+-75.140277777778","horizontal_datum_name":"UNK","state":"New York","county":"Delaware","county_code":36025,"huc":"02040101","hsa":"BGM","zero_datum":1189.88,"vertical_datum_name":"NAVD88","rfc_forecast_point":true,"rfc_defined_fcst_point":true,"riverpoint":true},"usgs_data":{"name":"WEST BRANCH DELAWARE RIVER AT WALTON NY","geo_rfc":"MARFC","latitude":42.1661111,"longitude":-75.14002778,"map_link":"https://maps.google.com/maps?t=k&q=loc:42.1661111+-75.14002778","coord_accuracy_code":"1","latlon_datum_name":"NAD83","coord_method_code":"N","state":"New York","huc":"02040101","site_type":"ST","altitude":1189.88,"alt_accuracy_code":0.01,"alt_datum_code":"NAVD88","alt_method_code":"L","drainage_area":332.0,"drainage_area_units":"square miles","contrib_drainage_area":null,"active":true,"gages_ii_reference":true},"nwm_feature_data":{"downstream_feature_id":2613590,"latitude":42.15888,"longitude":-75.14107,"altitude":360.38,"stream_length":1589.0,"stream_order":4,"mannings_roughness":0.055,"slope":0.0005,"channel_side_slope":0.2855113,"nhd_waterbody_comid":null},"env_can_gage_data":{"name":null,"latitude":null,"longitude":null,"map_link":null,"drainage_area":null,"contrib_drainage_area":null,"water_course":null},"nws_preferred":{"name":"Walton","latitude":42.166111111111,"longitude":-75.140277777778,"latlon_datum_name":"UNK","state":"New York","huc":"02040101"},"usgs_preferred":{"name":"WEST BRANCH DELAWARE RIVER AT WALTON NY","latitude":42.1661111,"longitude":-75.14002778,"latlon_datum_name":"NAD83","state":"New York","huc":"02040101"},"upstream_nwm_features":["2613508","2613510"],"downstream_nwm_features":["2613590"]}]}
2021-10-26T19:48:56.480+0000 DEBUG WrdsFeatureService For from=NWS_LID and to=NWM_FEATURE_ID, found these: {WALN6=2613578, FROV2=5907079}
2021-10-26T19:48:56.480+0000 DEBUG FeatureFinder New left names: {wres.config.generated.Feature@4e51eda7[left=<null>(default), right=<null>(default), baseline=FROV2]=01631000, wres.config.generated.Feature@2862271a[left=<null>(default), right=<null>(default), baseline=WALN6]=01423000}
2021-10-26T19:48:56.480+0000 DEBUG FeatureFinder New right names: {wres.config.generated.Feature@4e51eda7[left=<null>(default), right=<null>(default), baseline=FROV2]=5907079, wres.config.generated.Feature@2862271a[left=<null>(default), right=<null>(default), baseline=WALN6]=2613578}
2021-10-26T19:48:56.480+0000 DEBUG FeatureFinder New baseline names: {}
2021-10-26T19:48:56.480+0000 DEBUG FeatureFinder Densified feature group A FEW FEATURES.
epag commented 2 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2021-10-27T13:37:31Z


Right you are, James. Its not related to feature groups. If I remove the group and include just the two features FROV2 and WALN6, I see this in the logs:

2021-10-27T13:31:04.125+0000 INFO WrdsFeatureService  Getting location data from https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2
2021-10-27T13:31:04.540+0000 INFO WrdsFeatureService  Getting location data from https://nwcal-wrds.[host]/api/location/v3.0/metadata/nws_lid/WALN6,FROV2

So there is some redundancy in the calls, regardless. It would be good to remove the duplicate call to the service, but, again, not a high priority. Description and subject modified.

Hank

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-10-27T13:39:48Z


Right.

I'm not certain it is redundancy, actually, because it is requesting for two separate pairs of feature dimensions, even though one of them is fixed (nws) and hence the logging appears to show duplicate requests (edit: the @INFO@ logging could probably be a bit clearer about that, so I suppose that is one possible outcome of this ticket). They are not, actually, duplicate requests (edit: I mean from the perspective of wres - literally speaking, they are), but I'm not sure why they need to be requested as pairs, rather than tuples. Perhaps that is just how the feature service works.

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-10-27T13:45:35Z


In summary, needs input from Jesse. There may be scope to deduplicate here, minimally to clarify the log messages, but the reason for the apparently duplicate requests is that two pairs of feature dimensions are being resolved, both with the same (known) feature dimension hook of @nws_lid@, so it isn't a blatant bug, it is more complicated than that. edit: it would also be worth checking w/ 5.13 - same behavior? I assume, yes.

epag commented 2 months ago

Original Redmine Comment Author Name: Jesse (Jesse) Original Date: 2021-10-27T15:10:15Z


Confirmed in -dev it has the two identical requests to WRDS (-dev has the WRES 5.12) so this is pre-existing.

I see from your debug log a reason (not an excuse): @2021-10-26T19:48:56.238+0000 DEBUG WrdsFeatureService For from=NWS_LID and to=USGS_SITE_CODE, found these: {WALN6=01423000, FROV2=01631000}@ @2021-10-26T19:48:56.480+0000 DEBUG WrdsFeatureService For from=NWS_LID and to=NWM_FEATURE_ID, found these: {WALN6=2613578, FROV2=5907079}@

The API between @FeatureFinder@ and @WrdsFeatureService@ has a single @from@ and a single @to@:


    /**
     * Given a dimension "from" and dimension "to", look up the set of features.
     * @param projectConfig The declaration to use when printing error message.
     * @param featureService The featureService element, optional unless lookup
     *                       ends up being required.
     * @param from The known feature dimension, in which "featureNames" exist.
     * @param to The unknown feature dimension, the dimension to search in.
     * @param featureNames The names in the "from" dimension to look for in "to"
     * @return The Set of name pairs: "from" as key, "to" as value.
     * @throws ProjectConfigException When a feature service was needed but null
     * @throws PreIngestException When the count of features in response differs
     *                            from the count of feature names requested, or
     *                            when the requested "to" was not found in the
     *                            response.
     * @throws UnsupportedOperationException When unknown "from" or "to" given.
     * @throws NullPointerException When projectConfig or featureNames is null.
     */

    static Map<String, String> bulkLookup( ProjectConfig projectConfig,
                                           FeatureService featureService,
                                           FeatureDimension from,
                                           FeatureDimension to,
                                           Set<String> featureNames )
</code>

So the same "from" was used twice but not the same "to." It is an opportunity for deduplication of requests. I guess you would want to make the bulkLookup accept a @List@ and then further complicate the @FeatureFinder@ to somehow take advantage of that.

But herein lies a question about whether we want the opportunity for these kinds of optimizations. The assumption of "one feature name per every other feature name for this evaluation" that @FeatureFinder@ currently has is the very assumption that allows for these kinds of optimizations. If you remove that assumption, I think you might have to remove the existing optimizations and it might avoid this one. You would have to instead pick an order whereby features could snowball, like "first look up left dimension to right dimension, second look up left dimension to baseline dimension, third look up right dimension to baseline dimension, fourth look up baseline dimension to left dimension, fifth, look up.." and then pick a number of iterations after which you stop, because the first lookup can go from 1 to N, the second lookup can go from 1 to N, then the third lookup can go from 1 to N, etc.

Edit: or I guess you would just have a lookup or two for each declared feature instead of attempting to look at many features from one dimension to another.

epag commented 2 months ago

Original Redmine Comment Author Name: Hank (Hank) Original Date: 2021-10-27T15:30:36Z


Jesse,

I don't have time to think about this deeply, but, in general, this duplication does not appear to be a big thing. Thus, if it complicates the code to get rid of it, its probably not worth it. Rejection is perfectly reasonable for this ticket.

Thanks,

Hank

epag commented 2 months ago

Original Redmine Comment Author Name: James (James) Original Date: 2021-10-27T15:33:55Z


This is kind of what I wondered, the optimization might not be worth it even if it is literally a duplicate request. I suppose you could still implement optimization above the caller by caching responses, edit: per url request, that is (there is no reason that a response would change during an evaluation and, strictly, that would render the evaluation invalid anyway). Still, even though that is not an api change, it is also extra complexity.