Closed SmithB closed 4 months ago
@SmithB, I was able to recreate the problem and confirm what you are seeing. Specifically, when the geoparquet option is enabled (which is the default), the geometry column contains duplicated x,y coordinates. When the parquet option is enabled (which is done by setting as_geo to False), the problem goes away.
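A quick way to confirm the duplicated coordinates without plotting is to measure the longest run of consecutive identical (x, y) pairs. The sketch below is only an illustration: `longest_duplicate_run` is a hypothetical helper, not part of the SlideRule client, and the sample values are made up.

```python
from itertools import groupby

def longest_duplicate_run(xs, ys):
    """Return the length of the longest run of consecutive identical (x, y) pairs."""
    pairs = zip(xs, ys)
    # groupby collapses consecutive equal pairs into one group each
    return max((sum(1 for _ in group) for _, group in groupby(pairs)), default=0)

# Made-up sample: the first three points share one coordinate
xs = [-27.0, -27.0, -27.0, -25.0, -25.1]
ys = [70.0, 70.0, 70.0, 71.0, 71.1]
print(longest_duplicate_run(xs, ys))
```

On the returned dataframe, the same helper could be called with `atl06_sr.geometry.x` and `atl06_sr.geometry.y`. Long runs of identical coordinates would be consistent with the duplication described above.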
Upon investigation, the issue is due to a bug in how the latitude and longitude fields are decoded when they are used for the geometry column. The fields were not being identified as "batch" fields, so only the first record in each batch (~256 elevations) had its latitude and longitude read, and those values were applied to the rest of the batch.
This bug has been fixed with commit e94161ae. The code now correctly reads each latitude and longitude.
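The effect of that decoding bug can be sketched in a few lines. This is a simplified illustration, not the SlideRule client code: the function names and sample batches are hypothetical, and real batches hold roughly 256 elevations each.

```python
def decode_first_only(batches):
    """Buggy behaviour: read only the first record's coordinate and reuse it for the whole batch."""
    out = []
    for batch in batches:
        first = batch[0]
        out.extend([first] * len(batch))
    return out

def decode_per_record(batches):
    """Intended behaviour: read the coordinate of every record in every batch."""
    out = []
    for batch in batches:
        out.extend(batch)
    return out

# Hypothetical (lat, lon) records grouped into two small batches
batches = [
    [(70.0, -27.0), (70.1, -27.1), (70.2, -27.2)],
    [(71.0, -25.0), (71.1, -25.1)],
]
buggy = decode_first_only(batches)
fixed = decode_per_record(batches)
print(len(set(buggy)), len(set(fixed)))
```

In the buggy path every record in a batch inherits the first record's position, which is why the geometry no longer lines up with per-record fields such as h_mean.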
When I save sliderule output to a local file and enable the 'open_on_complete' option, the dataframe h_mean field is scrambled relative to the geometry field.
import os
import numpy as np
import matplotlib.pyplot as plt
from sliderule import icesat2
region=[{'lat': 69.95536798500007, 'lon': -27.338821302999975}, {'lat': 69.96134097100008, 'lon': -27.41378932599997}, {'lat': 70.00485198100006, 'lon': -27.94445625999998}, {'lat': 70.04489099100005, 'lon': -28.430339370999945}, {'lat': 70.04682093500003, 'lon': -28.437871333999965}, {'lat': 70.05568600200007, 'lon': -28.46451123199995}, {'lat': 70.05699893500008, 'lon': -28.467971424999973}, {'lat': 70.37890593200007, 'lon': -29.07789428799998}, {'lat': 70.38314796600008, 'lon': -29.08488241799995}, {'lat': 70.38548999600005, 'lon': -29.08836141599994}, {'lat': 70.41485498300005, 'lon': -29.109849251999947}, {'lat': 70.41633599600004, 'lon': -29.110683267999946}, {'lat': 70.43761400800008, 'lon': -29.121194383999978}, {'lat': 72.05856300000005, 'lon': -28.842372458999932}, {'lat': 72.05995099800003, 'lon': -28.841072312999984}, {'lat': 72.06101199000005, 'lon': -28.83889329799996}, {'lat': 72.09917392500006, 'lon': -28.661121289999983}, {'lat': 72.10018098500007, 'lon': -28.642110444999958}, {'lat': 71.36114495400005, 'lon': -24.69513343099993}, {'lat': 71.34807493300008, 'lon': -24.635889265999936}, {'lat': 70.85243200600007, 'lon': -22.429544190999934}, {'lat': 70.85201200100005, 'lon': -22.42873307399998}, {'lat': 70.44017048600006, 'lon': -21.66382393799995}, {'lat': 70.15249599600008, 'lon': -22.067161032999934}, {'lat': 70.13031699800007, 'lon': -22.225189091999937}, {'lat': 70.12006296400006, 'lon': -22.29837198299998}, {'lat': 70.10610897400005, 'lon': -22.449150016999965}, {'lat': 70.09590900600006, 'lon': -22.56953808299994}, {'lat': 70.08132897500008, 'lon': -22.758966229999942}, {'lat': 69.95536798500007, 'lon': -27.338821302999975}]
parms = {
    "poly": region,
    "srt": icesat2.SRT_LAND,
    "cnf": icesat2.CNF_SURFACE_LOW,
    "ats": 10.0,
    "cnt": 10,
    "len": 40.0,
    "res": 20.0,
    "maxi": 6,
}
output_dict = {"path": os.path.join(os.getcwd(), 'Scoresby.parquet'), "format": "parquet", "open_on_complete": True}
# output to file (comment this out to run without saving to geoparquet)
parms['output']=output_dict
atl06_sr = icesat2.atl06p(parms)
atl06_sr['longitude'] = np.array(atl06_sr.geometry.x)
atl06_sr['latitude'] = np.array(atl06_sr.geometry.y)
plt.figure()
plt.scatter(atl06_sr['longitude'][::20], atl06_sr['latitude'][::20], 2, c=atl06_sr['h_mean'][::20], vmin=40, vmax=1000)
results with save enabled:
results with save disabled (this is what I expect the results to look like):