GeoDaCenter / spatial_access

https://spatial.uchicago.edu
37 stars 11 forks source link

add default category weight list to AccessModel #49

Closed gyoliver closed 5 years ago

gyoliver commented 5 years ago

Add the ability for users to specify a default category weight list to AccessModel. Julia agreed that this is useful functionality.

@ifarah Would you prefer that the user has to specify the default list whenever calculating access scores, or on top of the ability to set a default list when running the code, would it also be useful to have it hard-coded into the module (i.e., in Configs.py) so the users doesn't have to set a default list each time? I believe category weights aren't relevant to AccessCount, AccessSum, AccessTime, right?

ifarah commented 5 years ago

Indeed, category weights are not relevant in AccessCount, AccessSum, AccessTime. The default list should be hard-coded so the user doesn't have to specify it. You mean the [1,1,1,1,1..] right?

gyoliver commented 5 years ago

Yup, was referring to the [1,1,1,1,1,1,1,1,1,1].

On Wed, Apr 17, 2019 at 10:16 PM Irene Farah notifications@github.com wrote:

Indeed, category weights are not relevant in AccessCount, AccessSum, AccessTime. The default list should be hard-coded so the user doesn't have to specify it. You mean the [1,1,1,1,1..] right?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484340469, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVV2CCXHWKGISLW5LN3PQ7RZVANCNFSM4HGY3GWQ .

ifarah commented 5 years ago

Yeah, I think it should be hard coded if not specified by the user.

On Apr 17, 2019, at 11:21 PM, George Oliver notifications@github.com<mailto:notifications@github.com> wrote:

Yup, was referring to the [1,1,1,1,1,1,1,1,1,1].

On Wed, Apr 17, 2019 at 10:16 PM Irene Farah notifications@github.com<mailto:notifications@github.com> wrote:

Indeed, category weights are not relevant in AccessCount, AccessSum, AccessTime. The default list should be hard-coded so the user doesn't have to specify it. You mean the [1,1,1,1,1..] right?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484340469, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVV2CCXHWKGISLW5LN3PQ7RZVANCNFSM4HGY3GWQ .

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484350666, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AFMFR44TQD6TCK4IN4LVSCDPQ7ZODANCNFSM4HGY3GWQ.

jkoschinsky commented 5 years ago

Agreed

On Thu, Apr 18, 2019, 1:36 AM Irene Farah notifications@github.com wrote:

Yeah, I think it should be hard coded if not specified by the user.

On Apr 17, 2019, at 11:21 PM, George Oliver <notifications@github.com mailto:notifications@github.com> wrote:

Yup, was referring to the [1,1,1,1,1,1,1,1,1,1].

On Wed, Apr 17, 2019 at 10:16 PM Irene Farah <notifications@github.com mailto:notifications@github.com> wrote:

Indeed, category weights are not relevant in AccessCount, AccessSum, AccessTime. The default list should be hard-coded so the user doesn't have to specify it. You mean the [1,1,1,1,1..] right?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub < https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484340469>,

or mute the thread < https://github.com/notifications/unsubscribe-auth/ADSGLVV2CCXHWKGISLW5LN3PQ7RZVANCNFSM4HGY3GWQ>

.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub< https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484350666>, or mute the thread< https://github.com/notifications/unsubscribe-auth/AFMFR44TQD6TCK4IN4LVSCDPQ7ZODANCNFSM4HGY3GWQ>.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484374962, or mute the thread https://github.com/notifications/unsubscribe-auth/ADILY72AU7X3WJJCTHM4MJLPRAJEXANCNFSM4HGY3GWQ .

gyoliver commented 5 years ago

Logan: just for planning purposes, do you have an idea when you might be able get to this? No rush. I realize you're a full-time student :)

On Thu, Apr 18, 2019 at 7:33 AM jkoschinsky notifications@github.com wrote:

Agreed

On Thu, Apr 18, 2019, 1:36 AM Irene Farah notifications@github.com wrote:

Yeah, I think it should be hard coded if not specified by the user.

On Apr 17, 2019, at 11:21 PM, George Oliver <notifications@github.com mailto:notifications@github.com> wrote:

Yup, was referring to the [1,1,1,1,1,1,1,1,1,1].

On Wed, Apr 17, 2019 at 10:16 PM Irene Farah <notifications@github.com mailto:notifications@github.com> wrote:

Indeed, category weights are not relevant in AccessCount, AccessSum, AccessTime. The default list should be hard-coded so the user doesn't have to specify it. You mean the [1,1,1,1,1..] right?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <

https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484340469 ,

or mute the thread <

https://github.com/notifications/unsubscribe-auth/ADSGLVV2CCXHWKGISLW5LN3PQ7RZVANCNFSM4HGY3GWQ

.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub<

https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484350666 , or mute the thread<

https://github.com/notifications/unsubscribe-auth/AFMFR44TQD6TCK4IN4LVSCDPQ7ZODANCNFSM4HGY3GWQ .

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub < https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484374962 , or mute the thread < https://github.com/notifications/unsubscribe-auth/ADILY72AU7X3WJJCTHM4MJLPRAJEXANCNFSM4HGY3GWQ

.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484486558, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVSIINQEPRQF3GIPGKDPRBTADANCNFSM4HGY3GWQ .

lmnoel commented 5 years ago

shooting for tomorrow @gyoliver

gyoliver commented 5 years ago

Great!

On Thu, Apr 18, 2019 at 3:06 PM Logan Noel notifications@github.com wrote:

shooting for tomorrow @gyoliver https://github.com/gyoliver

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484669766, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVTOUUVY5JRU3SFB7J3PRDICRANCNFSM4HGY3GWQ .

lmnoel commented 5 years ago

Ok looking at this a bit further, I'm not sure I understand the point of having a default of [1,1,1,1,1...]. Given the "capacity" variable was removed from this model, that would mean that only the decayed time would contribute to the score for each category. I'm not sure I can wrap my head around that.

gyoliver commented 5 years ago

AccessScore uses a dictionary whose keys are destination category field values and whose values are lists of integers. These integers are used to weight the n'th facility of the same type when calculating the access score. In the contracts repo, we had a default weight list containing ten 1s, meaning the first ten facilities of the same type would be given the same weight (and any subsequent facility was given a weight of 0) before inverse weighting by network distance from the origin. This default weight list was applied to each destination category. You can see this in contracts/analytics/CommuniAnalytics.py line 338 (DIMINISH_WEIGHTS = [1,1,1, 1,1,1,1,1,1,1]). The custom_weight_dict parameter to the enclosing function could be used if the user wanted to apply different weightings per category.

As I understand it currently, the user of spatial_access has to provide a category_weight_dict argument that has a list of weight values per category in the destination dataset's category field, and if there isn't an exact match between the list of categories in the data and in the dictionary, the code throws an error. We think that a sizeable majority of users will probably want to apply a default weight list, but there may be exceptions depending on the category. Basically, the idea is to resurrect how things were coded in contracts if possible, with some tweaks.

I think maybe the best way to handle this (without having thought about the difficulty of implementation) is to include the default weight list [1,1,1,1,1,1,1,1,1,1] in Configs.py. Then also allow the user to supply weight lists for any exception categories, but use the default list for any category not found in the custom weight dict. So if the user supplied the dict {"Hospitals": [1, 0.67, 0.33], "Clinics": [1, 0.8, 0.6, 0.4, 0.2]}, the code would use these two lists for Hospitals and Clinics destination records, respectively, and [1,1,1,1,1,1,1,1,1,1] for other destination categories.

However, there may be users who want to use different default weight lists for different runs--it'd be a pain if they had to crack open the code to change the hard-coded default each time. So it'd also be nice to allow the user to specify a default weight list to be used in place of the list in Configs.py. For example, if a user thinks that each subsquent facility of the same type should really be weighted less for a particular analysis they're doing, and they have a particular weighting scheme for the category "Hospital", they might provide a custom_weight_dict argument of {"DEFAULT_WEIGHT_LIST": [1, 0.8, 0.6, 0.4, 0.2], "Hospital": [1, 0.67, 0.33]}.

Irene -- does this sound right?

Logan -- Let me know if something doesn't make sense or if you foresee difficulty implementing this.

George

On Thu, Apr 18, 2019 at 5:52 PM Logan Noel notifications@github.com wrote:

Ok looking at this a bit further, I'm not sure I understand the point of having a default of [1,1,1,1,1...]. Given the "capacity" variable was removed from this model, that would mean that only the decayed time would contribute to the score for each category. I'm not sure I can wrap my head around that.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484715385, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVXKE6VUFFC6UHZT7KLPRD3SVANCNFSM4HGY3GWQ .

lmnoel commented 5 years ago

No it's not about the difficulty of implement, it's about making it intuitive to use--and from my mind, having an arbitrary number of [1's] doesn't seem like an intuitive default. I guess I don't have the best perspective on that though.

gyoliver commented 5 years ago

Yeah, I'm pretty sure using 10 ones instead of 8 or 5 is partially arbitrary. I'll have to let Irene explain why 10 1s was chosen.

George

On Thu, Apr 18, 2019 at 6:42 PM Logan Noel notifications@github.com wrote:

No it's not about the difficulty of implement, it's about making it intuitive to use--and from my mind, having an arbitrary number of [1's] doesn't seem like an intuitive default. I guess I don't have the best perspective on that though.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484724337, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVXGD4QVGHF6ZZ2WYYLPREBOPANCNFSM4HGY3GWQ .

ifarah commented 5 years ago

The "capacity" variable indeed is out and we're talking about "category_weight_dict". As I understand from your code, now the user has to mandatory write out a dictionary of weights in the "calculate" function. Before, there was a default of "None". We had implemented a default in the case that a user had 200 amenities and you don't want to write out weights for each destination. Also, when users want to weigh each category the same. Of course, 10 1's is arbitrary, but we thought 10 was a good enough number of elements to include in the hard-coded version. Maybe it could be more? That would work too. What do you think?

lmnoel commented 5 years ago

I guess I assumed that the most sensible default would be not "a list of 1's" but rather, not to decay on category at all.

ifarah commented 5 years ago

Yes. That makes perfect sense. What happens if I have "schools", "bookstores", "restaurants", but only specify the weights for schools and bookstores? Would the restaurants category not be decayed?

lmnoel commented 5 years ago

Is that the intended/desired behavior?

gyoliver commented 5 years ago

I could imagine a case where somebody uses the code to, for example, measure access to commercial opportunities within a mile. There could be dozens of businesses within that mile network radius and the user wouldn't want a decay based on the count of businesses to be applied. Using an "indefinitely long list of 1s" seems like a valid default. Dropping the weight to zero would seem confusing and arbitrary to a user doing this sort of analysis.

On Fri, Apr 19, 2019 at 10:12 AM Irene Farah notifications@github.com wrote:

Yes. That makes perfect sense. What happens if I have "schools", "bookstores", "restaurants", but only specify the weights for schools and bookstores? Would the restaurants category not be decayed?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484926158, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVTDXNJRUEG3EU4NZG3PRHOMFANCNFSM4HGY3GWQ .

lmnoel commented 5 years ago

By "not decaying on category" I mean effectively using an infinitely long list of 1s but not actually coding it that way. Just not using any weight, which does the same thing.

ifarah commented 5 years ago

It's happened to me before that I don't specify one of the categories in the weights, and I didn't realize. I think the default of infinitely list of 1s is great. It would just be great to know which categories are "defaulted" in the analysis.

lmnoel commented 5 years ago

ok so the workflow you want is something like this: don't specify any category weights: don't decay on category for any category specify a subset of category weights: don't decay on category for categories not in subset, do use weights for categories that are specified

OR

specify a subset of category weights: don't include scores for categories not in subset, do use weights for categories that are specified

ifarah commented 5 years ago

The first one: specify a subset of category weights: don't decay on category for categories not in subset, do use weights for categories that are specified

Letting the user know which categories are not included in subset.

lmnoel commented 5 years ago

Sure sure, will do

gyoliver commented 5 years ago

Irene -- do you think the user should be able to specify an alternative default weighting scheme?

On Fri, Apr 19, 2019 at 10:43 AM Logan Noel notifications@github.com wrote:

Sure sure, will do

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484936150, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVRKYF4N2JUULCK345LPRHSCBANCNFSM4HGY3GWQ .

lmnoel commented 5 years ago

That seams awfully convoluted for a user to understand

ifarah commented 5 years ago

Yeah, I think changing the default doesn't make sense, since the user can specify the weights herself. If they're really into it, they can go to the code too (:

gyoliver commented 5 years ago

Sounds good to me. That's definitely a call for a domain specialist to make.

On Fri, Apr 19, 2019 at 11:18 AM Irene Farah notifications@github.com wrote:

Yeah, I think changing the default doesn't make sense, since the user can specify the weights herself. If they're really into it, they can go to the code too (:

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-484946560, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVWXYI4VSDNPA5RZR6DPRHWEVANCNFSM4HGY3GWQ .

gyoliver commented 5 years ago

Thanks for getting to this so quickly, Logan. Just want to check that I understand correctly -- to use the default weighting scheme, you pass an empty dictionary for the category_weight_dict parameter of AccessModel's calculate() method, correct?

George

On Fri, Apr 19, 2019 at 2:37 PM Logan Noel notifications@github.com wrote:

Closed #49 https://github.com/GeoDaCenter/spatial_access/issues/49.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#event-2288879300, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVSWR2HEOQFXCLTLWHTPRINPNANCNFSM4HGY3GWQ .

lmnoel commented 5 years ago

It’s an optional parameter, so you can either pass an empty dictionary or leave it blank.

On Mon, Apr 22, 2019 at 14:51 George Oliver notifications@github.com wrote:

Thanks for getting to this so quickly, Logan. Just want to check that I understand correctly -- to use the default weighting scheme, you pass an empty dictionary for the category_weight_dict parameter of AccessModel's calculate() method, correct?

George

On Fri, Apr 19, 2019 at 2:37 PM Logan Noel notifications@github.com wrote:

Closed #49 https://github.com/GeoDaCenter/spatial_access/issues/49.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub < https://github.com/GeoDaCenter/spatial_access/issues/49#event-2288879300>, or mute the thread < https://github.com/notifications/unsubscribe-auth/ADSGLVSWR2HEOQFXCLTLWHTPRINPNANCNFSM4HGY3GWQ

.

— You are receiving this because you modified the open/close state.

Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-485529776, or mute the thread https://github.com/notifications/unsubscribe-auth/AE533KUFNKTC4GSRUD3VYYTPRYJKRANCNFSM4HGY3GWQ .

-- Logan Noel University of Chicago Class of 2019 Economics Major | Computer Science Minor lnoel@uchicago.edu

gyoliver commented 5 years ago

Thx!

On Mon, Apr 22, 2019 at 2:52 PM Logan Noel notifications@github.com wrote:

It’s an optional parameter, so you can either pass an empty dictionary or leave it blank.

On Mon, Apr 22, 2019 at 14:51 George Oliver notifications@github.com wrote:

Thanks for getting to this so quickly, Logan. Just want to check that I understand correctly -- to use the default weighting scheme, you pass an empty dictionary for the category_weight_dict parameter of AccessModel's calculate() method, correct?

George

On Fri, Apr 19, 2019 at 2:37 PM Logan Noel notifications@github.com wrote:

Closed #49 https://github.com/GeoDaCenter/spatial_access/issues/49.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub < https://github.com/GeoDaCenter/spatial_access/issues/49#event-2288879300 , or mute the thread <

https://github.com/notifications/unsubscribe-auth/ADSGLVSWR2HEOQFXCLTLWHTPRINPNANCNFSM4HGY3GWQ

.

— You are receiving this because you modified the open/close state.

Reply to this email directly, view it on GitHub < https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-485529776 , or mute the thread < https://github.com/notifications/unsubscribe-auth/AE533KUFNKTC4GSRUD3VYYTPRYJKRANCNFSM4HGY3GWQ

.

-- Logan Noel University of Chicago Class of 2019 Economics Major | Computer Science Minor lnoel@uchicago.edu

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GeoDaCenter/spatial_access/issues/49#issuecomment-485530212, or mute the thread https://github.com/notifications/unsubscribe-auth/ADSGLVUUCAIXDM7F3QWPDOLPRYJP7ANCNFSM4HGY3GWQ .