remulasce / metroapp

Realtime arrival notification utility for LA Metro

NexTrip scraper doesn't check for duplicate stops properly. #229

Closed remulasce closed 9 years ago

remulasce commented 9 years ago

Line 60 checks stopid against stoptag, which are different.

This results in duplicate entries for LA Metro Rail, which uses multiple stoptags per stopid on rail.

A stoptag is supposed to denote one single platform, whereas a stopid is like the whole station.

We want one entry per station, not platform. So we should compare against el[1], I think. Right?

E.g., check lametro-rail.db, stopid 80301. There are two entries, differing only in the stoptag. In the Android app, this would result in two identical stops added to the servicerequest, with the same stopids. I think.
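
To illustrate the per-station check described above, here is a minimal sketch. It assumes each scraped row is a (stoptag, stopid, stopname, lat, lon) tuple; the function name `dedupe_by_stopid` and the sample rows are hypothetical and are not taken from the scraper itself.

```python
def dedupe_by_stopid(rows):
    """Keep only the first row seen for each stopid (one entry per station)."""
    seen = set()
    unique = []
    for stoptag, stopid, stopname, lat, lon in rows:
        # Compare on stopid (the station), not stoptag (the platform).
        if stopid in seen:
            continue
        seen.add(stopid)
        unique.append((stoptag, stopid, stopname, lat, lon))
    return unique


# Hypothetical sample data: two platforms of one station, plus another station.
rows = [
    ("80301_0", "80301", "Example Station", 34.0, -118.0),
    ("80301_1", "80301", "Example Station", 34.0, -118.0),
    ("80302_0", "80302", "Another Station", 34.1, -118.1),
]
stations = dedupe_by_stopid(rows)
```

With this check, the two platform rows that share a stopid collapse into a single station entry.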

nighelles commented 9 years ago

el[0] should be stopID, taken from the website. el[1] is just the name of the stop. Is it giving multiple entries per stopID?

remulasce commented 9 years ago

databaselist.append( (uniquetag,stopid,stopname,lat,lon) )

remulasce commented 9 years ago

And, yes.

remulasce commented 9 years ago

And correction to proposed fix: Don't add uniquetag any more, since we're not using it.

nighelles commented 9 years ago

Unique tag is required because the database needs a primary index term that's totally unique, and some stops don't have stopIDs. Now granted, at the moment I'm ignoring those, but it will matter when we add those.

nighelles commented 9 years ago

Fixed code, will update new databases.

remulasce commented 9 years ago

SQL lib on Android auto-generated the unique index for me. Might want to do that, since I'm not sure if NexTrip can be trusted to generate unique names like that.

nighelles commented 9 years ago

Ok, but I mean, like, the table actually needs to have a unique index to get saved via sqlite3. So like, I may be able to use a lib to add one in the app, but the sqlite3 engine in Python won't save unless there's a primary index. And it has to be an INTEGER. So, since stopIDs are text, I used that.

remulasce commented 9 years ago

Right, see if the Python / whatever lib can generate one for you.

nighelles commented 9 years ago

Ok, so you just want like, an arbitrary index number. I can change that too.

remulasce commented 9 years ago

http://stackoverflow.com/questions/9342249/how-to-insert-a-unique-id-into-each-sqlite-row

sqlite should be able to do it for you.

remulasce commented 9 years ago

Looks like this should do it: CREATE TABLE <...> ( Id INTEGER PRIMARY KEY, stuff datatype, ... )

And you can just pretend the id doesn't exist thereafter, and sqlite will deal with it for you. (insert statements don't need a field for it. Just pretend it isn't there.)
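
A minimal sketch of that approach using Python's built-in `sqlite3` module. The table name `stops`, the column names, and the sample rows are assumptions for illustration, not the project's actual schema.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stops ("
    "id INTEGER PRIMARY KEY, stopid TEXT, stopname TEXT, lat REAL, lon REAL)"
)

# Hypothetical sample rows; note that no id value is supplied.
sample = [
    ("80301", "Example Station", 34.0, -118.0),
    ("80302", "Another Station", 34.1, -118.1),
]
conn.executemany(
    "INSERT INTO stops (stopid, stopname, lat, lon) VALUES (?, ?, ?, ?)",
    sample,
)
conn.commit()

# SQLite filled in the id column on its own.
ids = [row[0] for row in conn.execute("SELECT id FROM stops ORDER BY id")]
```

The insert statements name only the data columns, and SQLite assigns the integer key itself.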

nighelles commented 9 years ago

Ok, I'll look into if that actually works.

remulasce commented 9 years ago

Fixed by Nighelles.

remulasce commented 9 years ago

Future note: You don't have to specify the primary key as a column at all.

Sqlite will create one for you (the implicit rowid), and assign increasing values to it, if you don't specify one.
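
A small sketch of that behaviour with Python's `sqlite3` module, assuming an ordinary table (not one declared `WITHOUT ROWID`). The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No primary key column is declared at all.
conn.execute("CREATE TABLE stops (stopid TEXT, stopname TEXT)")
conn.executemany(
    "INSERT INTO stops VALUES (?, ?)",
    [("80301", "Example Station"), ("80302", "Another Station")],
)
conn.commit()

# The implicit key is still there and can be selected by name.
rowids = [row[0] for row in conn.execute("SELECT rowid FROM stops ORDER BY rowid")]
```

One caveat: without the `AUTOINCREMENT` keyword, SQLite may reuse a rowid after the row holding the largest value is deleted, so the implicit key is unique at any given time but not guaranteed to be unique over the table's whole history.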