heremaps / here-cli

A command-line interface to work with HERE XYZ Hub and other HERE APIs
https://www.here.xyz/
MIT License

upload -i does not assign feature IDs to geojson files #210

Open burritojustice opened 4 years ago

burritojustice commented 4 years ago

upload -i assigns a feature ID for CSVs, but I cannot get it to work on GeoJSON files. (Could have sworn this worked before?)

burritojustice commented 4 years ago

As a reminder, upload -o does respect existing feature IDs.

burritojustice commented 4 years ago

This may be an opportunity to clarify how feature IDs get assigned by XYZ and why we have these options, because even I get confused.

By default, if you upload a feature to the XYZ API, and it has a feature ID, XYZ will save it with that feature ID.

If you upload another feature with that same ID, XYZ will replace the first feature with the second. This is fine if the feature IDs are well managed and actually unique, but it is a problem if they are poorly managed or not actually unique.

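This is often the case with data that once resided in ArcGIS systems, which often give features in a dataset incremental integer IDs.

This is bad: datasets numbered this way tend to reuse the same IDs, so features from one upload can silently replace features from another.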
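To make the collision concrete, here is a minimal Python sketch. It models a space as a plain dictionary keyed by feature ID, which is only an assumption standing in for the replace-on-same-ID behavior described above. The `upload` helper and the sample datasets are hypothetical and are not part of XYZ or the CLI.

```python
def upload(space, features):
    """Store each feature under its ID; an existing ID is replaced."""
    for feature in features:
        space[feature["id"]] = feature
    return space


# Two unrelated (made-up) datasets that both use incremental integer IDs.
parks = [
    {"id": 1, "properties": {"name": "Park A"}},
    {"id": 2, "properties": {"name": "Park B"}},
]
libraries = [
    {"id": 1, "properties": {"name": "Library A"}},
    {"id": 2, "properties": {"name": "Library B"}},
]

space = {}
upload(space, parks)
upload(space, libraries)

# Four features went in, but the libraries replaced the parks.
print(len(space))
print(space[1]["properties"]["name"])
```

Nothing in this sketch reports the overwrite, which is why colliding IDs are easy to miss.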

This is why, by default, the CLI creates a hash based on the properties of a feature and uses it as the feature ID. (This is also how we detect duplicates.)
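A rough sketch of the idea in Python follows. The exact hash the CLI uses is not stated in this thread, so the choice of MD5 over sorted-key JSON is an assumption made for illustration, and `hashed_id` is a hypothetical helper.

```python
import hashlib
import json


def hashed_id(feature):
    """Derive an ID from a feature's properties (illustrative scheme only)."""
    # sort_keys makes the serialization independent of property order.
    canonical = json.dumps(feature["properties"], sort_keys=True)
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Same incoming ID, different properties: the hashes differ, so no overwrite.
a = {"id": 1, "properties": {"name": "Park A", "acres": 16}}
b = {"id": 1, "properties": {"name": "Library A", "floors": 6}}

# Different incoming ID, same properties: the hashes match, so it is a duplicate.
duplicate_of_a = {"id": 99, "properties": {"acres": 16, "name": "Park A"}}

ids = {hashed_id(f) for f in (a, b, duplicate_of_a)}
print(len(ids))
```

The incoming `id` is ignored entirely, which is what makes the scheme safe for datasets with unreliable IDs and what makes `-o` necessary when the existing IDs should be kept.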

However, with -o we can override this hash and use the existing feature IDs on well-managed datasets where the IDs are in fact unique.

There may also be cases where we want to accept duplicate IDs, which is why we have -u.

We have -i because sometimes we want to take a property and assign it as a feature ID.

(Right now, -i does not seem to be working for GeoJSON files, just CSVs.)
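For reference, the behavior expected from -i on GeoJSON can be sketched in Python as below. This is not the CLI's implementation: `assign_ids_from_property` is a hypothetical helper, the sample collection is made up, and converting the value to a string is an assumption.

```python
def assign_ids_from_property(collection, key):
    """Copy properties[key] into each feature's top-level id."""
    for feature in collection["features"]:
        properties = feature.get("properties") or {}
        if key not in properties:
            raise KeyError(f"feature has no property {key!r}")
        # Assumption: IDs are stored as strings.
        feature["id"] = str(properties[key])
    return collection


collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
            "properties": {"district": 1},
        },
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [1.0, 1.0]},
            "properties": {"district": 2},
        },
    ],
}

result = assign_ids_from_property(collection, "district")
print([feature["id"] for feature in result["features"]])
```

In GeoJSON the `id` sits at the top level of each feature, next to `properties`, not inside it, which differs from a CSV row where every column is a flat field.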

There may be cases where there is a unique feature ID, but there are multiple values for that ID. A good example of this is election results, where the unique thing in a CSV (an electoral district) has multiple values (votes for multiple candidates).

There is an interesting opportunity to help users by patching the unique feature with these multiple values.

You often see election result CSVs that look like this:

electoral_district,candidate,party,votes,percentage
1,Smith,Red,600,60%
1,Jones,Blue,400,40%
2,Miller,Red,100,20%
2,Baker,Blue,400,80%
3,Jackson,Red,100,10%
3,Bleigh,Blue,400,40%
3,Graham,Green,500,50%

It would be nice to be able to patch an existing feature with these additional values as new objects. You'd need to pick a property key to roll things up by, but say you chose party:

(below is a WIP)


{
  "feature": {
    "district": {
      "id": 1,
      "party": [
        {
          "id": "red",
          "candidate": "Smith",
          "votes": 600,
          "percentage": "60%"
        },
        {
          "id": "blue",
          "candidate": "Jones",
          "votes": 400,
          "percentage": "40%"
        }
      ]
    }
  }
}
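One way the roll-up could work is sketched below in Python, using the election CSV from above. `roll_up` is a hypothetical helper and not something the CLI provides. It groups rows by the ID column and nests the remaining columns under the chosen roll-up key (party here). Keying the nested objects by party name, lowercased, is an assumption; a list of objects would work as well.

```python
import csv
import io
from collections import defaultdict

CSV_TEXT = """electoral_district,candidate,party,votes,percentage
1,Smith,Red,600,60%
1,Jones,Blue,400,40%
2,Miller,Red,100,20%
2,Baker,Blue,400,80%
3,Jackson,Red,100,10%
3,Bleigh,Blue,400,40%
3,Graham,Green,500,50%
"""


def roll_up(rows, id_key, group_key):
    """Group rows by id_key, nesting the other columns under group_key."""
    rolled = defaultdict(dict)
    for row in rows:
        values = {k: v for k, v in row.items() if k not in (id_key, group_key)}
        rolled[row[id_key]][row[group_key].lower()] = values
    return dict(rolled)


rows = csv.DictReader(io.StringIO(CSV_TEXT))
rolled = roll_up(rows, "electoral_district", "party")

# Each district now appears once, with one nested object per party.
print(sorted(rolled))
print(rolled["1"]["red"])
```

All values stay strings because they come straight from the CSV reader. A real patch step would probably want to convert `votes` to a number before writing it into the feature's properties.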

Discussed this further here:
https://github.com/heremaps/here-cli/issues/205