rOpenGov / fmi

Finnish Meteorological Institute open data API R client

Field ParameterValue has a 4 UTF-8 characters whereas a maximum of 3 is allowed #23

Closed by ilarischeinin 7 years ago

ilarischeinin commented 7 years ago

This query gives the following error:

library(fmi)

# a working apiKey needs to be filled in
apiKey <- ""

request <- FMIWFSRequest$new(apiKey=apiKey)
request$setParameters(
  request="getFeature",
  storedquery_id="fmi::observations::weather::daily::simple",
  fmisid=100971L,
  starttime="2016-10-13T00:00:00",
  endtime="2016-10-13T23:59:59"
)
client <- FMIWFSClient$new(request=request)
layers <- client$listLayers()

response <- client$getLayer(
  layer=layers[1L],
  crs="+proj=longlat +datum=WGS84",
  swapAxisOrder=TRUE,
  parameters=list(splitListFields=TRUE)
)
ogr2ogr -f GML  -splitlistfields /var/folders/_3/gy_1s85s4dj7039m1j863x1h0000gn/T//RtmpXdfRj5/filedfcc1128407e /var/folders/_3/gy_1s85s4dj7039m1j863x1h0000gn/T//RtmpXdfRj5/filedfcc69a218fe BsWfsElement
ERROR 1: Field ParameterValue has a 4 UTF-8 characters whereas a maximum of 3 is allowed
ERROR 1: Unable to write feature 1 from layer BsWfsElement.
ERROR 1: Terminating translation prematurely after failed
translation of layer BsWfsElement (use -skipfailures to skip errors)
Error in convertOGR(sourceFile = private$cachedResponseFile, layer = layer,  : 
  Conversion failed.

The exact same query works for most days but fails for some, such as 2016-10-13 above. Looking at the two temporary files passed to ogr2ogr, the problem seems to be in the first one. I wonder whether this is something we can do anything about, or whether the problem is on FMI's side. Here are the contents of that file:

<?xml version="1.0" encoding="utf-8" ?>
<ogr:FeatureCollection
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ogr.maptools.org/ filee4685a378a76.xsd"
     xmlns:ogr="http://ogr.maptools.org/"
     xmlns:gml="http://www.opengis.net/gml">
  <gml:boundedBy><gml:null>missing</gml:null></gml:boundedBy>

</ogr:FeatureCollection>

And here is the corresponding file for a day that does work (2016-10-14):

<?xml version="1.0" encoding="utf-8" ?>
<ogr:FeatureCollection
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://ogr.maptools.org/ filee43a5197ae0.xsd"
     xmlns:ogr="http://ogr.maptools.org/"
     xmlns:gml="http://www.opengis.net/gml">
  <gml:boundedBy>
    <gml:Box>
      <gml:coord><gml:X>24.94459</gml:X><gml:Y>60.17523</gml:Y></gml:coord>
      <gml:coord><gml:X>24.94459</gml:X><gml:Y>60.17523</gml:Y></gml:coord>
    </gml:Box>
  </gml:boundedBy>

  <gml:featureMember>
    <ogr:BsWfsElement fid="BsWfsElement.0">
      <ogr:geometryProperty><gml:Point srsName="EPSG:4258"><gml:coordinates>24.94459,60.17523</gml:coordinates></gml:Point></ogr:geometryProperty>
      <ogr:gml_id>BsWfsElement.1.1.1</ogr:gml_id>
      <ogr:Time>2016-10-14T00:00:00Z</ogr:Time>
      <ogr:ParameterName>rrday</ogr:ParameterName>
      <ogr:ParameterValue>0.3</ogr:ParameterValue>
    </ogr:BsWfsElement>
  </gml:featureMember>
  <gml:featureMember>
    <ogr:BsWfsElement fid="BsWfsElement.1">
      <ogr:geometryProperty><gml:Point srsName="EPSG:4258"><gml:coordinates>24.94459,60.17523</gml:coordinates></gml:Point></ogr:geometryProperty>
      <ogr:gml_id>BsWfsElement.1.1.2</ogr:gml_id>
      <ogr:Time>2016-10-14T00:00:00Z</ogr:Time>
      <ogr:ParameterName>tday</ogr:ParameterName>
      <ogr:ParameterValue>2.9</ogr:ParameterValue>
    </ogr:BsWfsElement>
  </gml:featureMember>
  <gml:featureMember>
    <ogr:BsWfsElement fid="BsWfsElement.2">
      <ogr:geometryProperty><gml:Point srsName="EPSG:4258"><gml:coordinates>24.94459,60.17523</gml:coordinates></gml:Point></ogr:geometryProperty>
      <ogr:gml_id>BsWfsElement.1.1.3</ogr:gml_id>
      <ogr:Time>2016-10-14T00:00:00Z</ogr:Time>
      <ogr:ParameterName>snow</ogr:ParameterName>
      <ogr:ParameterValue>-1</ogr:ParameterValue>
    </ogr:BsWfsElement>
  </gml:featureMember>
  <gml:featureMember>
    <ogr:BsWfsElement fid="BsWfsElement.3">
      <ogr:geometryProperty><gml:Point srsName="EPSG:4258"><gml:coordinates>24.94459,60.17523</gml:coordinates></gml:Point></ogr:geometryProperty>
      <ogr:gml_id>BsWfsElement.1.1.4</ogr:gml_id>
      <ogr:Time>2016-10-14T00:00:00Z</ogr:Time>
      <ogr:ParameterName>tmin</ogr:ParameterName>
      <ogr:ParameterValue>1.4</ogr:ParameterValue>
    </ogr:BsWfsElement>
  </gml:featureMember>
  <gml:featureMember>
    <ogr:BsWfsElement fid="BsWfsElement.4">
      <ogr:geometryProperty><gml:Point srsName="EPSG:4258"><gml:coordinates>24.94459,60.17523</gml:coordinates></gml:Point></ogr:geometryProperty>
      <ogr:gml_id>BsWfsElement.1.1.5</ogr:gml_id>
      <ogr:Time>2016-10-14T00:00:00Z</ogr:Time>
      <ogr:ParameterName>tmax</ogr:ParameterName>
      <ogr:ParameterValue>3.8</ogr:ParameterValue>
    </ogr:BsWfsElement>
  </gml:featureMember>
</ogr:FeatureCollection>

Judging by this, the actual data is simply missing from the first file. However, the contents of the second file passed to ogr2ogr look fine. It is structurally analogous to the corresponding file for a date that works (2016-10-14), and it has all the data it is supposed to have:

<?xml version="1.0" encoding="UTF-8"?>
<wfs:FeatureCollection
  timeStamp="2016-12-23T10:59:58Z"
  numberReturned="5"
  numberMatched="5"
      xmlns:wfs="http://www.opengis.net/wfs/2.0"
    xmlns:gml="http://www.opengis.net/gml/3.2"
    xmlns:BsWfs="http://xml.fmi.fi/schema/wfs/2.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.opengis.net/wfs/2.0 http://schemas.opengis.net/wfs/2.0/wfs.xsd
                        http://xml.fmi.fi/schema/wfs/2.0 http://xml.fmi.fi/schema/wfs/2.0/fmi_wfs_simplefeature.xsd"
>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.1">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.1" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>rrday</BsWfs:ParameterName>
                <BsWfs:ParameterValue>-1.0</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.2">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.2" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>tday</BsWfs:ParameterName>
                <BsWfs:ParameterValue>1.5</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.3">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.3" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>snow</BsWfs:ParameterName>
                <BsWfs:ParameterValue>NaN</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.4">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.4" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>tmin</BsWfs:ParameterName>
                <BsWfs:ParameterValue>0.0</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.5">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.5" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>tmax</BsWfs:ParameterName>
                <BsWfs:ParameterValue>3.4</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

</wfs:FeatureCollection>

Above I have only queried individual days (because, for another purpose, I've set up a system that caches the results of daily queries). But one can also query ranges of multiple days, and then even the offending days work just fine. Out of the past 40 years, there seem to be 50 days that are problematic when queried individually.

So, I'm not sure whether this is a problem on our side or FMI's. And even if it is on our side, I'm not sure it's worth much effort, as it can be circumvented by querying multiple days.
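
The workaround mentioned above could be sketched roughly as follows: widen the time window to cover two days, then keep only the rows for the day of interest. This is only a base R sketch. `query_window()` and `keep_day()` are hypothetical helper names, and the `obs` data frame is a hand-made stand-in for a parsed response (values copied from the files shown above), not something the fmi package is known to return in exactly this shape.

```r
# Hypothetical helper: build a two-day query window starting on `day`.
query_window <- function(day) {
  day <- as.Date(day)
  list(
    starttime = format(day, "%Y-%m-%dT00:00:00"),
    endtime   = format(day + 1, "%Y-%m-%dT23:59:59")
  )
}

# Hypothetical helper: keep only the rows whose Time falls on `day`.
keep_day <- function(obs, day) {
  obs[substr(obs$Time, 1, 10) == format(as.Date(day)), , drop = FALSE]
}

# These two strings would take the place of starttime/endtime
# in request$setParameters() from the code at the top of this issue.
window <- query_window("2016-10-13")

# Stand-in for a parsed two-day response (values copied from this issue).
obs <- data.frame(
  Time = rep(c("2016-10-13T00:00:00Z", "2016-10-14T00:00:00Z"), each = 5),
  ParameterName = rep(c("rrday", "tday", "snow", "tmin", "tmax"), times = 2),
  ParameterValue = c("-1.0", "1.5", "NaN", "0.0", "3.4",
                     "0.3", "2.9", "-1", "1.4", "3.8"),
  stringsAsFactors = FALSE
)

# Keep only the observations for the day that fails when queried alone.
one_day <- keep_day(obs, "2016-10-13")
print(one_day)
```

Filtering on the date part of `Time` keeps the caching of per-day results possible even though the request itself spans more than one day.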

jlehtoma commented 7 years ago

Thanks for reporting this! I'll have a closer look at some point, but good to hear that the issue can be circumvented by querying multiple days.

ilarischeinin commented 7 years ago

I don't know why I didn't think to check this originally, but when you execute the R code above (the very first code chunk), which is for the date 2016-10-13, this is the URL it fetches from:

http://data.fmi.fi/fmi-apikey/<apiKey>/wfs?request=getFeature&storedquery_id=fmi::observations::weather::daily::simple&fmisid=100971&starttime=2016-10-13T00:00:00&endtime=2016-10-13T23:59:59

Below is the reply from the server, and it looks fine to me. In fact, it looks identical in structure to the reply for a date on which the R code works fine, like 2016-10-14. The only differences are in the timestamps and the actual measurement values. The measurements do sometimes look odd, such as -1.0 or NaN for snow depth, but these values seem to work fine for some dates, so that shouldn't be the problem.

Anyways, this makes me think that the problem is indeed on our side and not FMI's.

<?xml version="1.0" encoding="UTF-8"?>
<wfs:FeatureCollection
  timeStamp="2017-01-18T16:56:30Z"
  numberReturned="5"
  numberMatched="5"
      xmlns:wfs="http://www.opengis.net/wfs/2.0"
    xmlns:gml="http://www.opengis.net/gml/3.2"
    xmlns:BsWfs="http://xml.fmi.fi/schema/wfs/2.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.opengis.net/wfs/2.0 http://schemas.opengis.net/wfs/2.0/wfs.xsd
                        http://xml.fmi.fi/schema/wfs/2.0 http://xml.fmi.fi/schema/wfs/2.0/fmi_wfs_simplefeature.xsd"
>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.1">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.1" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>rrday</BsWfs:ParameterName>
                <BsWfs:ParameterValue>-1.0</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.2">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.2" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>tday</BsWfs:ParameterName>
                <BsWfs:ParameterValue>1.5</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.3">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.3" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>snow</BsWfs:ParameterName>
                <BsWfs:ParameterValue>NaN</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.4">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.4" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>tmin</BsWfs:ParameterName>
                <BsWfs:ParameterValue>0.0</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

    <wfs:member>
            <BsWfs:BsWfsElement gml:id="BsWfsElement.1.1.5">
                <BsWfs:Location>
                    <gml:Point gml:id="BsWfsElementP.1.1.5" srsDimension="2" srsName="http://www.opengis.net/def/crs/EPSG/0/4258">
                        <gml:pos>60.17523 24.94459 </gml:pos>
                    </gml:Point>
                </BsWfs:Location>
                <BsWfs:Time>2016-10-13T00:00:00Z</BsWfs:Time>
                <BsWfs:ParameterName>tmax</BsWfs:ParameterName>
                <BsWfs:ParameterValue>3.4</BsWfs:ParameterValue>
            </BsWfs:BsWfsElement>
    </wfs:member>

</wfs:FeatureCollection>
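
As a quick sanity check of the ogr2ogr message against this response, the lengths of the ParameterValue strings can be compared with the limit of 3 named in the error. A minimal base R sketch: the values are typed in by hand from the XML above, and `limit` simply mirrors the number in the error message. It only shows which value is longer than 3 characters; it does not establish why ogr2ogr settles on that width.

```r
# ParameterValue strings for 2016-10-13, copied by hand from the reply above.
values <- c(rrday = "-1.0", tday = "1.5", snow = "NaN", tmin = "0.0", tmax = "3.4")

# Maximum width quoted in the ogr2ogr error message.
limit <- 3L

# Number of characters in each value (names are preserved).
widths <- nchar(values, type = "chars")

# Names of the parameters whose value is wider than the limit.
too_long <- names(values)[widths > limit]

print(widths)
print(too_long)
```

Only the `rrday` value `-1.0` is 4 characters long, which matches the count given in the error message.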

jlehtoma commented 7 years ago

@ilarischeinin I had a quick look at this and was able to reproduce your original issue using ogr2ogr: 2016-10-13 does not work. However, I also tested it with the develop branch of rwfs, which does away with the need for ogr2ogr (see here for more details). With this version, the data seems to be parsed without errors. Here's the response data for 2016-10-13:

fid  gml_id              Time                  ParameterName  ParameterValue
1    BsWfsElement.1.1.1  2016-10-13T00:00:00Z  rrday          -1.0
2    BsWfsElement.1.1.2  2016-10-13T00:00:00Z  tday           1.5
3    BsWfsElement.1.1.3  2016-10-13T00:00:00Z  snow           NaN
4    BsWfsElement.1.1.4  2016-10-13T00:00:00Z  tmin           0.0
5    BsWfsElement.1.1.5  2016-10-13T00:00:00Z  tmax           3.4

The negative value and the NaN you mentioned are still there, but as you say, these probably aren't the issue. Here's the response data for 2016-10-14:

fid  gml_id              Time                  ParameterName  ParameterValue
1    BsWfsElement.1.1.1  2016-10-14T00:00:00Z  rrday          0.3
2    BsWfsElement.1.1.2  2016-10-14T00:00:00Z  tday           2.9
3    BsWfsElement.1.1.3  2016-10-14T00:00:00Z  snow           -1.0
4    BsWfsElement.1.1.4  2016-10-14T00:00:00Z  tmin           1.4
5    BsWfsElement.1.1.5  2016-10-14T00:00:00Z  tmax           3.8

If this is indeed the case, then it looks more like an issue with ogr2ogr, with downloading the file, with both, or with something else. Could you test this by installing the development version of rwfs:

devtools::install_github("ropengov/rwfs", ref = "develop")

I should get this merged and released to CRAN...
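
Once the response parses into a table like the ones above, the ParameterValue column can be turned into numbers and the NaN entries flagged. A small base R sketch, assuming the values arrive as character strings (the thread does not show the column types); the data frame below is typed in by hand from the 2016-10-13 table, and the `value` and `missing` column names are made up for illustration.

```r
# Hand-typed copy of the 2016-10-13 rows shown above.
obs <- data.frame(
  ParameterName = c("rrday", "tday", "snow", "tmin", "tmax"),
  ParameterValue = c("-1.0", "1.5", "NaN", "0.0", "3.4"),
  stringsAsFactors = FALSE
)

# as.numeric() understands the string "NaN", so no special casing is needed.
obs$value <- as.numeric(obs$ParameterValue)

# Flag the rows where the server reported NaN.
obs$missing <- is.nan(obs$value)

print(obs)
```

How to interpret the -1.0 entries is a separate question that this sketch leaves alone; it only converts them to the number -1.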

ilarischeinin commented 7 years ago

I did that, and it indeed fixed all the problematic cases I had (a total of 50 of the daily queries since 1959-01-01 for one FMI station).

Thank you!

I'll close this issue as the problem is not within the fmi package.

jlehtoma commented 7 years ago

@ilarischeinin, quick question: what platform are you on? Could you also post the output of the following command here:

library(rgdal)

Trying to figure out this issue with rwfs

ilarischeinin commented 7 years ago

OS X (Yosemite), and if that matters, gdal is installed via MacPorts (@2.1.2_1+expat).

library(rgdal)
Loading required package: sp
rgdal: version: 1.2-5, (SVN revision 648)
 Geospatial Data Abstraction Library extensions to R successfully loaded
 Loaded GDAL runtime: GDAL 2.1.2, released 2016/10/24
 Path to GDAL shared files: 
 Loaded PROJ.4 runtime: Rel. 4.9.1, 04 March 2015, [PJ_VERSION: 491]
 Path to PROJ.4 shared files: (autodetected)
WARNING: no proj_defs.dat in PROJ.4 shared files
 Linking to sp version: 1.2-3 
sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X Yosemite 10.10.5

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rgdal_1.2-5 sp_1.2-4   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.9          lattice_0.20-34      codetools_0.2-15     listenv_0.6.0        future_1.2.0         assertthat_0.1       digest_0.6.11        grid_3.3.2           plyr_1.8.4          
[10] gtable_0.2.0         stats4_3.3.2         StanHeaders_2.14.0-1 scales_0.4.1         ggplot2_2.2.1        lazyeval_0.2.0       tools_3.3.2          munsell_0.4.3        rstan_2.14.1        
[19] parallel_3.3.2       inline_0.3.14        colorspace_1.3-2     globals_0.8.0        tibble_1.2           gridExtra_2.2.1     

jlehtoma commented 7 years ago

Great, thanks!

jlehtoma commented 7 years ago

@ilarischeinin just to be sure, could you also paste the output of the following command here:

library(rgdal)
ogrDrivers()$name

ilarischeinin commented 7 years ago

Sure, here you go:

 [1] "AeronavFAA"     "AmigoCloud"     "ARCGEN"         "AVCBin"        
 [5] "AVCE00"         "BNA"            "Carto"          "Cloudant"      
 [9] "CouchDB"        "CSV"            "CSW"            "DGN"           
[13] "DXF"            "EDIGEO"         "ElasticSearch"  "ESRI Shapefile"
[17] "Geoconcept"     "GeoJSON"        "GeoRSS"         "GFT"           
[21] "GML"            "GPKG"           "GPSBabel"       "GPSTrackMaker" 
[25] "GPX"            "HTF"            "HTTP"           "Idrisi"        
[29] "JML"            "KML"            "MapInfo File"   "Memory"        
[33] "netCDF"         "ODS"            "OGR_GMT"        "OGR_PDS"       
[37] "OGR_SDTS"       "OGR_VRT"        "OpenAir"        "OpenFileGDB"   
[41] "OSM"            "PCIDSK"         "PDF"            "PGDUMP"        
[45] "PLSCENES"       "REC"            "S57"            "SEGUKOOA"      
[49] "SEGY"           "Selafin"        "SQLite"         "SUA"           
[53] "SVG"            "SXF"            "TIGER"          "UK .NTF"       
[57] "VDV"            "VFK"            "WAsP"           "WFS"           
[61] "XLSX"           "XPlane"        

jlehtoma commented 7 years ago

Thanks! So at least when GDAL is installed via MacPorts (@2.1.2_1+expat), the WFS driver is enabled.
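
The same check can be done without reading the list by eye. A minimal base R sketch: `drivers` here is a shortened, hand-copied subset of the names listed above, and with rgdal loaded it could instead be taken from `ogrDrivers()$name` as in the command earlier in this thread. The `needed` vector is an assumption about which drivers are of interest.

```r
# Shortened, hand-copied subset of the driver names listed above.
drivers <- c("GeoJSON", "GML", "GPKG", "SQLite", "WFS", "XLSX")

# Drivers we want to confirm (an assumption made for this sketch).
needed <- c("GML", "WFS")

# TRUE for each needed driver that appears in the list.
available <- needed %in% drivers
names(available) <- needed

print(available)
```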

ilarischeinin commented 7 years ago

I'm not sure if these are useful pieces of info, but: proj (@4.9.3_0) and geos (@3.6.1_0) were also installed via MacPorts, and my rgdal is from CRAN.