ArtesiaWater / hydropandas

Module for loading observation data into custom DataFrames
https://hydropandas.readthedocs.io
MIT License

BRO location with 2 filters: returns only first filter with read_bro #224

Closed ArtemisRo closed 4 months ago

ArtemisRo commented 4 months ago

I am encountering an issue with the hydropandas package when using the read_bro function to retrieve groundwater observation data. This is my code:

oc = hpd.read_bro(extent=(132530, 132670, 456775, 457030), keep_all_obs=True)

With this I get a DataFrame for the groundwater observation GMW000000005077_1, but it is empty.

However, when I run the following code, I do get data:

gw_bro = hpd.GroundwaterObs.from_bro("GMW000000005077", 2)

This correctly retrieves the data for well GMW000000005077 with filter number 2.

I would expect that running read_bro with the specified extent would retrieve data for any filter.

Can you please assist me in resolving this issue?

HMEUW commented 4 months ago

Interesting case. I started some analysis.

Larger extent

To test whether multiple filters are supported, I extended the extent to the north. Filters with a number > 1 are present for another location. Interestingly, GMW000000005077 is included with tube_nr=1, but its data is the data from tube_nr=2.


Detailed look at GMW000000005077

I downloaded the first and second filter separately. It turns out that the first filter has no data.

gw_bro_tube1 = hpd.GroundwaterObs.from_bro("GMW000000005077", tube_nr=1)
gw_bro_tube2 = hpd.GroundwaterObs.from_bro("GMW000000005077", tube_nr=2)

BTW: both filters have the same top and bottom level.

So my first conclusion is that the first filter is not included because it has no data. But that is strange, because you passed keep_all_obs=True. I don't know why that argument is not properly used in `bro.py`.

Second conclusion: is there a merge action somewhere? The x, y and filter levels of both filters are identical, and the obs_collection holds only tube_nr=1, with the observations of tube_nr=2. I cannot figure it out at this moment.


dbrakenhoff commented 4 months ago

I'm not entirely sure why this happens but I think there is an issue in the BRO database where the number of existing tubes does not match the stored value under the GMW entry for this particular measurement location?

This is the raw response from BROLoket for this particular extent, which only includes one tube. I would expect both tube numbers to be listed here?

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<dispatchCharacteristicsResponse xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:brocom="http://www.broservices.nl/xsd/brocommon/3.0"
    xmlns:gmwcommon="http://www.broservices.nl/xsd/gmwcommon/1.1"
    xmlns:gml="http://www.opengis.net/gml/3.2" xmlns="http://www.broservices.nl/xsd/dsgmw/1.1">
    <brocom:responseType>dispatch</brocom:responseType>
    <brocom:requestReference>-</brocom:requestReference>
    <brocom:dispatchTime>2024-07-09T13:48:17+02:00</brocom:dispatchTime>
    <numberOfDocuments>1</numberOfDocuments>
    <dispatchDocument>
        <GMW_C gml:id="BRO_0003">
            <brocom:broId>GMW000000005077</brocom:broId>
            <brocom:deregistered>nee</brocom:deregistered>
            <brocom:deliveryAccountableParty>30280353</brocom:deliveryAccountableParty>
            <brocom:qualityRegime>IMBRO/A</brocom:qualityRegime>
            <brocom:objectRegistrationTime>2019-03-27T15:32:40+01:00</brocom:objectRegistrationTime>
            <brocom:underReview>nee</brocom:underReview>
            <brocom:standardizedLocation srsName="urn:ogc:def:crs:EPSG::4258" gml:id="BRO_0001">
                <gml:pos>52.099919490 5.060168470</gml:pos>
            </brocom:standardizedLocation>
            <brocom:deliveredLocation srsName="urn:ogc:def:crs:EPSG::28992" gml:id="BRO_0002">
                <gml:pos>132592.000 456903.000</gml:pos>
            </brocom:deliveredLocation>
            <localVerticalReferencePoint codeSpace="urn:bro:gmw:LocalVerticalReferencePoint">NAP</localVerticalReferencePoint>
            <offset uom="m">0.000</offset>
            <verticalDatum codeSpace="urn:bro:gmw:VerticalDatum">NAP</verticalDatum>
            <groundLevelPosition uom="m">1.240</groundLevelPosition>
            <withPrehistory>nee</withPrehistory>
            <owner>30280353</owner>
            <constructionStandard codeSpace="urn:bro:gmw:ConstructionStandard">NEN5104</constructionStandard>
            <wellConstructionDate>
                <brocom:date>2002-08-02</brocom:date>
            </wellConstructionDate>
            <removed>nee</removed>
            <initialFunction codeSpace="urn:bro:gmw:InitialFunction">onbekend</initialFunction>
            <numberOfMonitoringTubes>1</numberOfMonitoringTubes>   <!-- Note the 1! -->
            <wellHeadProtector codeSpace="urn:bro:gmw:WellHeadProtector">onbekend</wellHeadProtector>
            <nitgCode>B31H2697</nitgCode>
            <wellCode>GMW31H002697</wellCode>
            <statusOverview>
                <tubeStatus codeSpace="urn:bro:gmw:TubeStatus">gebruiksklaar</tubeStatus>
            </statusOverview>
            <diameterRange>
                <smallestTubeTopDiameter xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                    uom="mm" xsi:nil="true" />
                <largestTubeTopDiameter xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                    uom="mm" xsi:nil="true" />
            </diameterRange>
            <screenPositionRange>
                <shallowestScreenTopPosition uom="m">-4.850</shallowestScreenTopPosition>
                <deepestScreenBottomPosition uom="m">-5.850</deepestScreenBottomPosition>
            </screenPositionRange>
        </GMW_C>
    </dispatchDocument>
</dispatchCharacteristicsResponse>

Since only one tube is listed, hydropandas will only attempt to download the observations for one tube... Or we're missing some logic to tell hydropandas that there is only data for one tube, but that tube is not tube_nr=1.
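The tube count in a response like the one above can be read with the standard library alone. This is a minimal sketch, not how hydropandas parses the response: the XML string is a trimmed copy of the response (most elements dropped for brevity), and the helper name `tube_counts` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the BROLoket response above; only the elements used here are kept.
RESPONSE = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<dispatchCharacteristicsResponse
    xmlns:brocom="http://www.broservices.nl/xsd/brocommon/3.0"
    xmlns:gml="http://www.opengis.net/gml/3.2"
    xmlns="http://www.broservices.nl/xsd/dsgmw/1.1">
    <dispatchDocument>
        <GMW_C gml:id="BRO_0003">
            <brocom:broId>GMW000000005077</brocom:broId>
            <numberOfMonitoringTubes>1</numberOfMonitoringTubes>
        </GMW_C>
    </dispatchDocument>
</dispatchCharacteristicsResponse>
"""

# Prefixes are local to this sketch; only the namespace URIs have to match the response.
NS = {
    "gmw": "http://www.broservices.nl/xsd/dsgmw/1.1",
    "brocom": "http://www.broservices.nl/xsd/brocommon/3.0",
}


def tube_counts(xml_text):
    """Return {broId: numberOfMonitoringTubes} for every GMW_C in the response.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    counts = {}
    for gmw in root.iterfind(".//gmw:GMW_C", NS):
        bro_id = gmw.findtext("brocom:broId", namespaces=NS)
        n_tubes = int(gmw.findtext("gmw:numberOfMonitoringTubes", namespaces=NS))
        counts[bro_id] = n_tubes
    return counts


print(tube_counts(RESPONSE))  # {'GMW000000005077': 1}
```

Note that the count only says how many tubes exist, not which numbers they carry, so on its own it cannot tell a client which tube_nr to request.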

EDIT: @HMEUW, just saw your post, thanks for diving into this. Seems like a measurement location was replaced? Or labelled as tube_nr 2 even though it should not have been?

HMEUW commented 4 months ago

> EDIT: @HMEUW, just saw your post, thanks for diving in to this, seems like a measurement location was replaced? Or labelled as tube_nr 2 even though it should not have been?

A replacement is possible. But with exactly the same filter levels? I suggest that @ArtemisRo contacts the owner of the location. It is the municipality of Utrecht (you can find that via the KvK number, which is in the deliveryAccountableParty tag). BTW: it would be a nice feature to rename the KvK numbers to real names ;).

OnnoEbbens commented 4 months ago

Thanks for pointing this out. All the vague errors are there because I assumed that if there is one filter it should be number 1.

I made a quick fix to solve this. I will optimize it later.
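The assumption described here can be illustrated without hydropandas. In this sketch both function names and the `existing_tube_nrs` argument are hypothetical; this is neither the original hydropandas code nor the fix itself, only the difference between the two ideas.

```python
def tube_nrs_assumed(n_tubes):
    """Faulty assumption: a well with n tubes has tube numbers 1..n."""
    return list(range(1, n_tubes + 1))


def tube_nrs_from_well(existing_tube_nrs):
    """Use the tube numbers that are actually registered for the well."""
    return sorted(existing_tube_nrs)


# GMW000000005077 has one tube, and that tube has number 2.
print(tube_nrs_assumed(1))       # [1]
print(tube_nrs_from_well({2}))   # [2]
```

With the first approach the only tube that is requested is one that does not exist, which matches the empty GMW000000005077_1 observation reported at the top of this issue.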

OnnoEbbens commented 4 months ago

To be clear, this code:

gw_bro_tube1 = hpd.GroundwaterObs.from_bro("GMW000000005077", tube_nr=1)
gw_bro_tube2 = hpd.GroundwaterObs.from_bro("GMW000000005077", tube_nr=2)

returned the same metadata for tube_nr=1 and tube_nr=2 because of a bug in hydropandas. In the BRO database tube_nr 1 does not exist. I don't think we have to contact the municipality for this.

In the PR I drafted (https://github.com/ArtesiaWater/hydropandas/pull/225) the behavior will change to:

>>> gw_bro_tube1 = hpd.GroundwaterObs.from_bro("GMW000000005077", tube_nr=1)
ValueError: gmw GMW000000005077 has no tube_nr 1 please choose a tube_nr from [2]

>>> gw_bro_tube2 = hpd.GroundwaterObs.from_bro("GMW000000005077", tube_nr=2)
>>> gw_bro_tube2


oc = hpd.read_bro(extent=(132530, 132670, 456775, 457030), keep_all_obs=True)

will return an ObsCollection with a single non-empty observation -> GMW000000005077_2
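A check that produces an error like the ValueError shown above could look roughly as follows. The function name `check_tube_nr` and its arguments are hypothetical and only mirror the error message quoted in this comment; the actual code in the PR may differ.

```python
def check_tube_nr(bro_id, tube_nr, available_tube_nrs):
    """Return tube_nr if it exists for this well, otherwise raise ValueError.

    Hypothetical helper; `available_tube_nrs` is assumed to hold the tube
    numbers that are registered for the well.
    """
    if tube_nr not in available_tube_nrs:
        raise ValueError(
            f"gmw {bro_id} has no tube_nr {tube_nr} "
            f"please choose a tube_nr from {sorted(available_tube_nrs)}"
        )
    return tube_nr


# Tube 2 exists, so it is returned unchanged.
print(check_tube_nr("GMW000000005077", 2, [2]))

# Tube 1 does not exist, so the caller gets an explicit error instead of empty data.
try:
    check_tube_nr("GMW000000005077", 1, [2])
except ValueError as err:
    print(err)
```

Failing loudly on a non-existent tube number avoids silently returning metadata or observations that belong to a different tube.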

OnnoEbbens commented 4 months ago

Fixed in version 0.12.1