CityScope / CS_choiceModels

An API for prediction of mobility choices based on personal and land use characteristics. The predictions use discrete choice models.
https://cityscope.media.mit.edu/CS_choiceModels/

Format standardization #2

Closed agrignard closed 5 years ago

agrignard commented 5 years ago

From what I can see, the format is driven by what the table gives you.

If I understand correctly, this is more or less the kind of block that you expose: "type":{"0":"P","1":"W","2":"L","3":"G","4":"Z"}}}

Why not. However, I think this list should come from your standard (the choice model) rather than being driven by a table. For instance, imagine you have to link this to the Volpe model, which uses different types (RL, RM, RS, OS, OL, OM). It would be useful for you to define the types that a table should expose if it wants to be compatible with your choice model.

Do you have a list of building types in your original model?

I am not sure what P, W, L, G and Z stand for.

RELNO commented 5 years ago

@agrignard I agree: table settings (i.e. the mapping object in the cityIO JSON) should be derived from the use case.

But it's actually a bigger question, now that we're implementing unique tags: at least in CityScopeJS tables, there could be T1, T2...Tn types. Is the idea to strictly predefine them like the fixed set of types in Volpe v.1?

yasushisakai commented 5 years ago

The current cityIO specification does not limit the types to one standard. At the least, it may be useful to be verbose inside the header to communicate what each type is. (The JSON data does not point to the wiki for clarification.) That said, RL (is it Real Lover?) is just as arbitrary as P (is it Panera Bread?).

RELNO commented 5 years ago

@yasushisakai {"0":"Phil","1":"Walid","2":"Luis","3":"Grignard","4":"Zhang"}

LAAP commented 5 years ago

Hi Guys.

I see the issue here. The question is: are these "surnames" really necessary? Personally, I think that the "surnames" (P, W, L, G, Z, etc.; RL, RM, RS, OS, OL, OM, etc.; or T1, T2, T3, etc.) are "redundant". These "surnames" are the user's interpretation of the data inside the grid and, in my opinion, they shouldn't be sent by the table, because each project will be different.

As far as I know, in the Volpe table it is RL, RM, RS, OS, OL, OM... because they represent "urban units of 22x22m". In CityScopeJS it may be T1, T2, T3... because it will be the type of land use or similar. So in each place/collaboration/table we will probably be changing the surnames, and nobody will agree on the previous ones.

With all that in mind, what if we stick with the simplest, most generic option: just the numbers? Maybe something like:

"Output":{"0","1","2","3","4"}}}

Probably I am proposing something that doesn't make sense, so please, educate me a bit.

Thanks for understanding!

A hug,

agrignard commented 5 years ago

With unique tags it is certainly redundant; more than redundant, it is a complete duplication of information. However, for many projects unique tags would be too much (not to mention the cost in time of building a table with unique tags); for a user, choosing between 10 types of blocks is already a lot. In that case, the format agreed on in 2.1 makes it possible to define a mapping (see https://github.com/CityScope/CS_CityIO_Backend/wiki/Data-Format#mapping-dictionary), which could perhaps be an optional field, since it is indeed not mandatory to explain what your data looks like (e.g. I could tweak the Volpe table and decide that RL is any building whatsoever in a GAMA simulation).

This discussion is more about the data format. What I am saying in the initial issue is that @doorleyr should be the one "imposing" the kinds of blocks his simulation needs in order to work. If I provide a table with only Pierre, Paul and Jack, I guess the choice model won't work.

RELNO commented 5 years ago

+1 for mapping as optional field.

Cost? Aren't we FOSS?

LAAP commented 5 years ago

OK,

What about mapping numbers to numbers? The "surnames" could simply be numbers from -2 to "n". For example:

"Output":{"0:-2","1:-2","2:4","3:5","4:1","5:1","6:1","7:1","8:-1","9:2","10:3","11:0"}}}

RELNO commented 5 years ago

"-2" (or any other form of key/value) is arbitrary. If an endpoint user needs such a mapping, they can remap locally.

LAAP commented 5 years ago

I was using the IDs from https://github.com/CityScope/CS_CityIO_Backend/wiki/Data-Mapping-of-Id-and-types, but I am not sure I understand you, @RELNO.

What I am trying to say is "let's be as general as possible in the reading (at the scanning stage)", and then let each table interpret the grid. The meaning will be different for a table focused on mobility than for a table focused on urban planning, but the tags can be the same for all tables, so the reading stays generic. Does that make sense to all of you, or am I totally lost?

RELNO commented 5 years ago

You're suggesting: "Output":{"0:-2","1:-2","2:4","3:5","4:1","5:1","6:1","7:1","8:-1","9:2","10:3","11:0"}}}. This implies that the scanning part forces some kind of mapping, which is not generic (why -2?). The data mapping you used was one use case, which might only be relevant for Volpe. My point is that in practice no mapping should occur or be imposed (what @agrignard called optional), so that the table sends only a long list of integers corresponding to the list of types, e.g. 0,1,5,3,3...
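The idea of a table sending only integers, with each client remapping locally, could be sketched roughly as follows. This is a minimal illustration, not the cityIO implementation: the function `remap_grid`, the `mobility_mapping` dictionary and the `"unknown"` fallback label are all hypothetical names chosen for this example.

```python
from typing import Dict, List, Optional


def remap_grid(
    grid: List[int],
    local_mapping: Optional[Dict[int, str]] = None,
    default: str = "unknown",
) -> List[str]:
    """Translate raw integer cell types into client-side labels.

    If the client supplies no mapping, the integers are passed through
    as strings, so nothing is imposed by the scanning side.
    """
    if local_mapping is None:
        return [str(cell) for cell in grid]
    # Integers the client does not know about fall back to `default`.
    return [local_mapping.get(cell, default) for cell in grid]


# Raw list of integers as sent by the table (values taken from the thread).
grid = [0, 1, 5, 3, 3]

# Hypothetical mapping owned by a mobility-focused client, not by the table.
mobility_mapping = {0: "Work", 1: "Living"}

labels = remap_grid(grid, mobility_mapping)
print(labels)
```

A client focused on urban planning could pass a different dictionary to the same function, while the table keeps sending the same integers.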

doorleyr commented 5 years ago

Regarding this module specifically, right now it only looks at 'Work' places and 'Living' places. I did it this way as a simple first iteration, but ultimately it may use more categories (e.g. Small, Medium, Large) or unique tags.

Since this module is a client of the grid API, it can't force a table to have the information it needs, just as client-side GAMA or GH won't work if they read from a grid that doesn't have the right types, and also can't force any table to have them. We just need to agree together on what the types should be, and then coordinate to ensure that the table this module reads from has the right types.
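Since a client cannot force a table to expose particular types, one option is for the client to check compatibility up front and fail with a clear message. The sketch below assumes the table exposes a type mapping shaped like the block quoted at the top of this issue; the function `missing_types` is hypothetical, and treating "W" and "L" as the Work and Living types is an assumption made only for illustration.

```python
import json
from typing import Dict, Set


def missing_types(type_mapping: Dict[str, str], required: Set[str]) -> Set[str]:
    """Return the required type labels that the table does not expose."""
    return required - set(type_mapping.values())


# Assumption: "W" and "L" stand for the Work and Living types this module needs.
REQUIRED_TYPES = {"W", "L"}

# Type mapping as quoted in the issue description.
table_types = json.loads('{"0":"P","1":"W","2":"L","3":"G","4":"Z"}')

# A table that exposes unrelated types, as in the "Pierre, Paul and Jack" example.
other_table_types = {"0": "Pierre", "1": "Paul", "2": "Jack"}

compatible_gaps = missing_types(table_types, REQUIRED_TYPES)
incompatible_gaps = missing_types(other_table_types, REQUIRED_TYPES)

print(sorted(compatible_gaps))    # nothing missing for the first table
print(sorted(incompatible_gaps))  # both required types missing for the second
```

A check like this does not settle which types should exist; it only makes a mismatch visible early, before the model runs on a grid it cannot interpret.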

RELNO commented 5 years ago

Closing, as this is not relevant here and the data format has changed since this was opened.