stactools-packages / fws-nwi

stactools package for the National Wetlands Inventory (NWI) product provided by the U.S. Fish and Wildlife Service (FWS)

Switch to geopandas #13

Closed gadomski closed 1 year ago

gadomski commented 1 year ago

Description: A major refactor, simplifying the geoparquet creation via geopandas. Output item and collection structure doesn't change too much.

Also computes the Item's geometry by union-ing all of the shapefiles in the zipfiles.
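As a rough illustration of that union step, here is a minimal sketch that uses shapely directly instead of the PR's actual geopandas code. The `union_footprint` helper and the two sample boxes are hypothetical stand-ins for the geometries read from each shapefile.

```python
from shapely.geometry import box, mapping
from shapely.ops import unary_union


def union_footprint(geometries):
    """Merge geometries into one footprint; return (GeoJSON dict, bbox list).

    Hypothetical helper for illustration only, not the package's code.
    """
    merged = unary_union(list(geometries))
    return mapping(merged), list(merged.bounds)


# Two adjacent unit squares stand in for per-shapefile footprints.
parts = [box(0, 0, 1, 1), box(1, 0, 2, 1)]
geometry, bbox = union_footprint(parts)
```

With a geopandas `GeoDataFrame`, the same idea would presumably be applied to its geometry column.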

I wouldn't be surprised if there are some edge cases that I've missed, but I'm going to rely on real-world exercising on the Planetary Computer test environment to shake those out. I'll feed those back in future PRs.

gadomski commented 1 year ago

Per the table extension and MS preference, the types listed in table:columns should be parquet types, not pandas types.

Ah, thanks. Do you know of a function to map from one to the other?

How widely accepted is the "cloud-optimized" role?

There's been some discussion in the past (I think on Gitter); my sense is that it's pretty unclear what "cloud-optimized" even means. I kept it there because I didn't have a compelling reason to remove it and it's not causing any harm, but maybe it's confusing?

pjhartzell commented 1 year ago

Do you know of a function to map from one to the other?

I don't think there is an exact match. The pandas object type is ambiguous. It could be a string, or a categorical, maybe other things. If all your pandas object types are strings, though, you could set up a mapping. Or read the parquet back in and extract the types from the parquet schema at that point. I ended up using Tom's stac-table repo, which pulls types from the parquet schema, to generate the initial STAC Item for NCN.
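The "set up a mapping" option could look roughly like the sketch below. It assumes every pandas `object` column holds strings, and the mapping table, helper names, and column names are all hypothetical. The targets are Parquet physical type names; the casing expected in `table:columns` is not settled in this thread.

```python
# Hypothetical mapping from pandas dtype names to Parquet physical types.
# Assumption: "object" columns contain only strings (stored as BYTE_ARRAY).
PANDAS_TO_PARQUET = {
    "bool": "BOOLEAN",
    "int8": "INT32",
    "int16": "INT32",
    "int32": "INT32",
    "int64": "INT64",
    "float32": "FLOAT",
    "float64": "DOUBLE",
    "object": "BYTE_ARRAY",
}


def parquet_type_for(dtype_name):
    """Look up the Parquet type for a pandas dtype name; fail loudly if unmapped."""
    try:
        return PANDAS_TO_PARQUET[dtype_name]
    except KeyError:
        raise ValueError(f"no Parquet type mapped for pandas dtype {dtype_name!r}") from None


def table_columns(dtypes):
    """Build table:columns-style entries from a {column name: dtype name} dict."""
    return [{"name": name, "type": parquet_type_for(dtype)} for name, dtype in dtypes.items()]


# Illustrative column names only.
columns = table_columns({"attribute": "object", "acres": "float64"})
```

Unmapped dtypes such as `category` raise instead of guessing, which is the ambiguity described above. Reading the types back from the written parquet schema avoids the guesswork entirely.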

but maybe it's confusing?

I don't find it confusing. I'm just not super clear on the use of roles in the wild, so I default to fewer rather than more. If it is found undesirable, it can be removed after Item creation. Just wondering what your thoughts were on it.

gadomski commented 1 year ago

a) my understanding is correct

Yup, I think so.

b) should we be adding proj:bbox going forward?

I don't think it's necessary, since (as you said) you can always derive it. I kept it just because it was there before and it's not doing much harm.
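For what "derive it" might mean in practice, here is a minimal sketch. The thread does not say what the bounding box would be derived from, so this assumes a GeoJSON-style geometry in the projected CRS (for example a `proj:geometry` value). The helper names and sample coordinates are hypothetical.

```python
def iter_positions(coords):
    """Yield (x, y) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        # A single position: [x, y] or [x, y, z].
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def bbox_from_geometry(geometry):
    """Return [xmin, ymin, xmax, ymax] for a GeoJSON-style geometry dict."""
    xs, ys = zip(*iter_positions(geometry["coordinates"]))
    return [min(xs), min(ys), max(xs), max(ys)]


# Assumed example: a rectangle in projected (metre) coordinates.
projected_geometry = {
    "type": "Polygon",
    "coordinates": [[
        [500000, 4100000],
        [500000, 4200000],
        [600000, 4200000],
        [600000, 4100000],
        [500000, 4100000],
    ]],
}
proj_bbox = bbox_from_geometry(projected_geometry)
```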