tmrudick / mta-turnstiles

node module and command line tool to parse MTA turnstile data

Issue with stations.json #5

Open whryan opened 7 years ago

whryan commented 7 years ago

It looks like the stations.json file isn't quite accurate. My apologies if I've misread this (I'm not great with Node), but it looks like that JSON uses the remote unit to match entries to station/line combinations.

The issue with this is that remote units aren't always unique to one station -- they sometimes encompass multiple stations, albeit generally multiple stations on the same line. So in effect you're arbitrarily choosing one of several stations and attributing all of that unit's traffic to it.
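To make the problem concrete, here is a minimal Python sketch of how one might check for remote units that map to more than one station. The row layout, the unit codes, and the station names are all invented placeholders, not values from the real MTA files or from stations.json, and `ambiguous_units` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical sample rows: (remote unit, station name).
# These values are invented for illustration only.
rows = [
    ("R001", "STATION A"),
    ("R001", "STATION B"),
    ("R002", "STATION C"),
    ("R002", "STATION C"),
]


def ambiguous_units(rows):
    """Return the remote units that appear with more than one station name."""
    stations_by_unit = defaultdict(set)
    for unit, station in rows:
        stations_by_unit[unit].add(station)
    # Keep only the units whose set of stations has more than one member.
    return {
        unit: sorted(stations)
        for unit, stations in stations_by_unit.items()
        if len(stations) > 1
    }


if __name__ == "__main__":
    print(ambiguous_units(rows))
```

Any unit this reports can't be resolved to a single station from the remote unit alone, which is the case a lookup keyed only on remote unit would get wrong.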

Is this an intentional choice? If so, it would probably be helpful to document it. If not, this might be something to look into -- I think it wouldn't be incredibly difficult to correct.

One suggested approach (this might be bad, but it seems logical to me):

I don't know Node or JavaScript well enough to implement this myself, but I'm working on it in Python. If I manage to build a more extensive/better dictionary, I can let you know.

whryan commented 7 years ago

Quick update - you can actually get this mostly correct just by matching on line, unit, booth, and division. You could then fuzzy string match the few entries that are left over, but the exact match should be close enough already.
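Roughly, the idea could be sketched like this in Python. This is only an illustration under assumptions: the field names, the sample keys, the 0.8 cutoff, and the choice to fuzzy match on station name are all mine rather than anything from the repo, and `match_station` is a hypothetical helper. The fuzzy step uses `difflib.get_close_matches` from the standard library.

```python
import difflib

# Hypothetical reference table keyed by (line, unit, booth, division).
# Keys and station names are invented placeholders.
station_by_key = {
    ("L1", "R001", "A001", "DIV1"): "STATION A",
    ("L1", "R001", "A002", "DIV1"): "STATION B",
}


def match_station(record, station_by_key, cutoff=0.8):
    """Match a record to a station: exact composite key first, fuzzy name second."""
    key = (record["line"], record["unit"], record["booth"], record["division"])
    if key in station_by_key:
        return station_by_key[key]

    # Fallback (assumption): compare the record's own station name
    # against the known station names and take the closest one.
    known_names = sorted(set(station_by_key.values()))
    close = difflib.get_close_matches(
        record.get("station", ""), known_names, n=1, cutoff=cutoff
    )
    return close[0] if close else None


if __name__ == "__main__":
    exact = {"line": "L1", "unit": "R001", "booth": "A002", "division": "DIV1"}
    print(match_station(exact, station_by_key))
```

Records that match neither way come back as `None`, so they can be listed and fixed by hand instead of being silently assigned to the wrong station.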

r-shekhar commented 7 years ago

@whryan Would you have a CSV or JSON file of the results of the match? It would be super helpful to have a mapping for a subway turnstile project I'm working on: https://github.com/r-shekhar/NYC-transport