zephyr-data-specs / GMNS

General Modeling Network Specification
https://zephyr-data-specs.github.io/GMNS/

Specification Clarification for node_id and zone_id in node.csv #83

Open asu-trans-ai-lab opened 1 month ago

asu-trans-ai-lab commented 1 month ago

We need to discuss the detailed specifications for node_id and zone_id in node.csv. Specifically, node_id should be an integer, as demonstrated in the standard sample data sets: https://github.com/zephyr-data-specs/GMNS/blob/main/examples/Lima/GMNS/node.csv

Currently, the specification lists the type as "any", which can cause confusion. zone_id, likewise, might be read as a string or a float, which introduces unnecessary complexity for packages that consume these files.

See https://github.com/zephyr-data-specs/GMNS/blob/main/spec/node.schema.json

"fields": [
  {
    "name": "node_id",
    "type": "any",
    "description": "Primary key",
    "constraints": { "required": true }
  },

This ambiguity could have downstream implications for from_node_id and to_node_id in the link schema: https://github.com/zephyr-data-specs/GMNS/blob/main/spec/link.schema.json . To avoid these issues, we should standardize node_id and zone_id as integers in the specification.
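To make the concern concrete, here is a minimal sketch of how a type mismatch can break the node/link relationship. The CSV contents and variable names are made up for illustration and are not taken from the Lima example; the sketch assumes one tool keeps node_id as text while another casts the link endpoints to integers.

```python
import csv
import io

# Hypothetical miniature files; real GMNS files carry more columns.
NODE_CSV = "node_id,x_coord,y_coord,zone_id\n1,0.0,0.0,101\n2,1.0,0.0,102\n"
LINK_CSV = "link_id,from_node_id,to_node_id\n10,1,2\n"


def read_rows(text):
    """Parse CSV text into a list of dicts; every value is a string."""
    return list(csv.DictReader(io.StringIO(text)))


nodes = read_rows(NODE_CSV)
links = read_rows(LINK_CSV)

# Assumed scenario: the node reader keeps IDs as text,
# while the link reader casts endpoints to int.
node_ids_as_text = {row["node_id"] for row in nodes}
link_endpoints_as_int = {
    int(row[field]) for row in links for field in ("from_node_id", "to_node_id")
}

# With mixed types, no endpoint is found among the nodes.
unmatched_mixed = link_endpoints_as_int - node_ids_as_text

# With one agreed type, every endpoint resolves.
node_ids_as_int = {int(row["node_id"]) for row in nodes}
unmatched_consistent = link_endpoints_as_int - node_ids_as_int

print(sorted(unmatched_mixed))
print(sorted(unmatched_consistent))
```

The same mismatch would appear if one side were parsed as float and the other as string, since "1" and "1.0" do not compare equal as text.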

ssmith55 commented 1 month ago

In allowing node_id and link_id to be unique ints or strings, we were following the example of GTFS unique_ID; see ID in https://gtfs.org/schedule/reference/#field-types . An application can certainly further restrict these ID fields to be integers. In any event, I would want us to be consistent across all of the ID fields (not just node and zone). I would be happy to discuss further. I suppose we could also add a note for users to the effect that "some applications require these ID fields to be integers." We could also fork a development version in which all of the IDs are large integers.

Thank you for the feedback - Scott
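As a rough sketch of the "application further restricts the ID fields" option mentioned above: an application could leave the specification as it is and run its own check that the ID columns hold only integers. The function name, the sample data, and the choice to skip empty values are all assumptions for illustration, not part of GMNS.

```python
import csv
import io
import re

# Plain decimal integers only, with an optional leading minus sign.
INTEGER_PATTERN = re.compile(r"-?[0-9]+")


def find_non_integer_ids(csv_text, id_fields):
    """Return (line_number, field, value) for each ID value that is not an integer.

    Empty values are skipped, on the assumption that the field is optional.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        for field in id_fields:
            value = (row.get(field) or "").strip()
            if value and not INTEGER_PATTERN.fullmatch(value):
                problems.append((line_number, field, value))
    return problems


# Hypothetical sample: one string ID, one float-looking zone_id, one empty zone_id.
SAMPLE = "node_id,zone_id\n1,101\nA7,102\n3,4.0\n4,\n"

problems = find_non_integer_ids(SAMPLE, ["node_id", "zone_id"])
for line_number, field, value in problems:
    print(f"line {line_number}: {field}={value!r} is not an integer")
```

A check like this keeps string IDs valid under the specification while letting an integer-only package reject files it cannot handle, and the same list of field names could be extended to every ID column for consistency.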