This PR adds an optional dtype argument to the imod.prepare.celltable function. It also refactors the spatial tests a bit to include test cases for shapefiles with floats as well as integers.
In theory, the dtype could also be inferred from the provided column name by opening the file without loading all data into memory and reading the dtype of that specific column from the attribute table. However, from what I've read, this would require quite some extra code, as the lower-level gdal or ogr packages would have to be used. The packages that make it easy to work with vector data (geopandas, fiona) load everything into memory upon opening files, as far as I understand, which would defeat the purpose of this function: it is designed for good performance on large datasets. Inferring dtypes from column names would thus require extra code, which I foresee being complex. Besides, an integer dtype is fine for most use cases. Less complexity beats ease of use here, I reckon.
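For reference, a rough sketch of only the final mapping step of such an inference, assuming the column's declared type has already been read from the file somehow (that is the hard part, and it is not shown). The helper name infer_column_type and the "int:10" / "float:24.15" style of type strings are assumptions for illustration; none of this is part of imod or of this PR.

```python
def infer_column_type(schema, column):
    """Map a declared attribute-table field type to a Python type.

    Hypothetical helper, not part of imod. `schema` is assumed to be a
    plain dict of column name -> declared type string, e.g. "int:10".
    """
    if column not in schema:
        raise KeyError(f"column {column!r} not found in attribute table")
    # Drop the optional width/precision suffix: "float:24.15" -> "float".
    base = schema[column].split(":", 1)[0].lower()
    if base.startswith("int"):
        return int
    if base == "float":
        return float
    # Strings, dates, etc. cannot be burned into a numeric grid.
    raise TypeError(f"column {column!r} has non-numeric type {base!r}")


# Made-up schema for illustration only.
example_schema = {"id": "int:10", "level": "float:24.15", "name": "str:80"}
id_type = infer_column_type(example_schema, "id")
level_type = infer_column_type(example_schema, "level")
```

Even with this mapping in place, the declared type would still have to be read without loading the features, which is where the extra complexity would sit. An explicit dtype argument avoids that entirely.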
Checklist
[x] Links to correct issue
[x] Update changelog, if changes affect users
[x] PR title starts with Issue #nr, e.g. Issue #737
Fixes #978