theCrag / website

theCrag.com: Add your voice and help guide the development of the world's largest collaborative rock climbing & bouldering platform
https://www.thecrag.com/

Stream event broken apart for no reason? Date splitting + BLC detection? #2079

Closed brendanheywood closed 3 years ago

brendanheywood commented 8 years ago

Ok so yesterday I did an epic gorge mapping recon trip and made a whole bunch of edits. I would expect them to all be in a single event for updates, and another single event for ticks, but it's been broken into 6 different events and mixed with other random stuff:

Updates from the 21st: (date is wrong)

Brendan Heywood edited some topos, some areas and some routes at Gara Gorge. • 1920 • about an hour ago Created 10 routes and 14 topos and updated 30 areas, 24 routes and 4 topos. http://www.thecrag.com/event/748806873

This one has stuff from Gara and from Melbourne??:

http://www.thecrag.com/event/748896012

Updates from the 20th: http://www.thecrag.com/event/748645038 http://www.thecrag.com/event/748722861

I did all of these edits last night and would expect them to all be a single event. Except for the Melbourne stuff, which should be a completely different event.

So I think there are a couple things going on:

The ticks event was also split: http://www.thecrag.com/event/748693380 (20th date is good) http://www.thecrag.com/event/748868535 (21st - date is wrong)

scd commented 8 years ago

I agree with your assessment of the problem.

I think the third-party location-to-timezone Perl module is a bit buggy, especially out in the middle of nowhere. I think we can live with this as long as it is buggy in a consistent way. Currently it picks the centroid of the closest ancestor with a location. It should probably just pick the TLC.

I actually found the location-to-timezone problem way harder than I expected. There were not many native Perl modules and the one I ended up choosing was a little buggy. I don't think Perl will do this properly for us. The other alternative is to use an API service, but we should not pay. I am also worried about the lag in a fairly time-critical part of the processing.

My ultimate backup here is to assign a timezone offset to nodes, and only use the algorithm if there is not one. I don't think daylight savings will be a problem.
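The fallback order described above could be sketched roughly as follows. This is only an illustration in Python, not the site's actual code: `resolve_offset`, `node_parent`, `stored_offset` and `fallback_lookup` are hypothetical names, and the node names are placeholders.

```python
def resolve_offset(node, node_parent, stored_offset, fallback_lookup):
    """Return a UTC offset in hours for a node.

    Walk up the parent chain and use the first explicitly stored offset.
    Only if no ancestor has one, fall back to the location-based algorithm.
    """
    current = node
    while current is not None:
        if current in stored_offset:
            return stored_offset[current]
        current = node_parent.get(current)
    return fallback_lookup(node)


# Hypothetical data: only one node carries an explicit offset.
node_parent = {"Australia": None, "Gara Gorge": "Australia", "some route": "Gara Gorge"}
stored_offset = {"Gara Gorge": 10.0}


def buggy_location_lookup(node):
    # Stand-in for the third-party location-to-timezone module.
    return 0.0


print(resolve_offset("some route", node_parent, stored_offset, buggy_location_lookup))
print(resolve_offset("Australia", node_parent, stored_offset, buggy_location_lookup))
```

The stored value wins whenever one exists on the node or any ancestor, so the third-party lookup only runs for nodes with nothing assigned above them.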

Secondly, it looks like some of the events did not find the TLC. Bugger, this may be more difficult to track down.

task list:

brendanheywood commented 8 years ago

Storing a timezone at the TLC level seems pretty sane. Using the Google API https://developers.google.com/maps/documentation/timezone/usage-limits we could populate all the nodes in a couple of days (2500 API calls per day) and almost never refresh them, maybe once a year.

Even that is probably more fine grained than we need, in most cases just a country or one level down from country would be sufficient

The TLC problem smells to me like it's relying on data / stats which aren't generated yet. It seems to happen with the nodes I just created, and not with updated data.

scd commented 8 years ago

I think we need to go down to one deeper than country

brendanheywood commented 8 years ago

More data for reference, I just got back from a second sneaky boulder mission and ticked a few more things, but ticks from today have been merged into the event from yesterday:

image

scd commented 8 years ago

Here are some notes on this investigation. I have not finished and have to go out, but @brendanheywood if anything triggers your thoughts please add to this.

TLC problem observations:

This is looking like it could be systematic. I am thinking that it is something to do with creating or updating topos. Or something to do with the route when drawing on the topo. SIMON TO INVESTIGATE DATA ON ACTUAL ACTIVITY LOG RECORDS.

I have audited the assignment of TLC and this comes directly from the database. There could be problems if either the node Level or AncestorCollector is set wrong. I think these should be set correctly. SIMON TO CHECK TO SEE IF IT IS POSSIBLE THAT REPARENTING COULD HAVE A TEMPORARY DATA ISSUE FOR LEVEL OR ANCESTORCOLLECTOR.

Also the timezone thing is tied to whatever TLC I discover. So the problem should be consistent. This makes me think that Gara Gorge has a bad timezone and you just happen to be working across the boundary. SIMON TO INVESTIGATE WHAT TIMEZONE IS ASSOCIATED WITH GARA GORGE.

brendanheywood commented 8 years ago

Just a random guess, but for some topos I made the routes first and then linked them; for others I made the topo, then made the routes, then linked them. There may be something to do with topos which don't have any routes linked, but again this is purely a guess from remembering what I did at the time.

Re timezone, can you expose this data, perhaps in the 'Admin for this area' page, just so we can see it?

scd commented 8 years ago

Timezone issue has been identified and fixed, but I think our reasoning about using a timezone setting associated with a country/state is reasonable (I will create a separate issue for that).

The problem was I had added 3 hours rather than subtracted 3 hours. So instead of the date ticking over at 3am it was ticking over at 9pm. Would you have done some edits before 9pm and some after?

BTW, Gara Gorge is giving an accurate +10 hours timezone.
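To make the sign error concrete, here is a minimal Python sketch of day bucketing with a 3am rollover. It is an illustration only, not the production code: the function names and the sample times are made up, and the +10 offset is the one quoted above.

```python
from datetime import date, datetime, timedelta

ROLLOVER_HOURS = 3  # the "event day" should tick over at 3am local time


def event_date_fixed(utc_time, tz_offset_hours):
    """Subtract the rollover: anything before 3am local counts as the previous day."""
    local = utc_time + timedelta(hours=tz_offset_hours)
    return (local - timedelta(hours=ROLLOVER_HOURS)).date()


def event_date_buggy(utc_time, tz_offset_hours):
    """Add the rollover by mistake: the day now ticks over at 9pm local."""
    local = utc_time + timedelta(hours=tz_offset_hours)
    return (local + timedelta(hours=ROLLOVER_HOURS)).date()


# Two illustrative edits in one evening session, 20:30 and 21:30 local (+10).
before_9pm = datetime(2015, 9, 20, 10, 30)  # UTC
after_9pm = datetime(2015, 9, 20, 11, 30)   # UTC

print(event_date_fixed(before_9pm, 10), event_date_fixed(after_9pm, 10))
# 2015-09-20 2015-09-20  (same event day)
print(event_date_buggy(before_9pm, 10), event_date_buggy(after_9pm, 10))
# 2015-09-20 2015-09-21  (session split at 9pm)
```

With the sign flipped, a single evening of edits that crosses 9pm lands on two different dates, which matches the split seen in the stream.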

scd commented 8 years ago

I am unable to work out what the problem is with TLC. Auditing the code seemed fine (see below for two potential failure modes) and data associated with the activity logs and the nodes in question seemed fine. The TLC is coming from the live database, not cached info.

New areas and routes should have their information instantly populated in the database, so this should not be the root cause.

As you are aware the database does use redundant fields for performance reasons and the TLC does come from two of these redundant fields (AncestorCollector and node level). The true master data is NodeParent which derives NodeSequence, Nx, Level and AncestorCollector data.

The Level and AncestorCollector data may be out of sync on a node for a couple of minutes on reparenting. In this instance this could only make a difference if you were reparenting from outside Gara Gorge to inside.

Toggling the TLC area type from Crag to Area and back again could cause similar TLC issues if you did some changes between toggling.
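As a rough illustration of the stale-field failure mode, here is a Python sketch. It is not the real schema or code: it assumes the TLC is the outermost ancestor whose area type is Crag, and `derive`, `find_tlc`, `node_parent`, `area_type` and the node names other than Gara Gorge and Australia are hypothetical.

```python
def derive(node_parent, node):
    """Walk parent pointers (the master data) and return (level, ancestors root-first)."""
    chain = []
    current = node
    while current is not None:
        chain.append(current)
        current = node_parent.get(current)
    chain.reverse()
    return len(chain) - 1, chain


def find_tlc(ancestors, area_type):
    """Return the outermost ancestor typed 'Crag', or None if there is none."""
    for name in ancestors:
        if area_type.get(name) == "Crag":
            return name
    return None


node_parent = {
    "Australia": None,
    "Gara Gorge": "Australia",
    "sector": "Gara Gorge",
    "route": "sector",
}
area_type = {"Australia": "Country", "Gara Gorge": "Crag", "sector": "Crag"}

level, fresh_ancestors = derive(node_parent, "route")
print(level, find_tlc(fresh_ancestors, area_type))

# A redundant ancestor list that has not caught up after a reparent from
# outside the crag would still describe the old location.
stale_ancestors = ["Australia", "route"]
print(find_tlc(stale_ancestors, area_type))
```

With freshly derived ancestors the lookup lands on Gara Gorge; with the stale list no crag is found at all, which is the kind of gap that would only last until the redundant fields are rebuilt.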

I don't think you reparented the route from outside Gara Gorge or toggled the TLC area type.

So there is something subtle going on here. I have added a logging line for activity.

I don't think I can do any more right now, so maybe we should just watch and observe for a while.

brendanheywood commented 8 years ago

9pm was definitely crossed so that makes perfect sense

On the TLC issue, Gara has fairly nested crags within crags, so maybe a BLC vs TLC issue?

scd commented 8 years ago

If it was tlc vs blc the stream would be set to blc not Australia.

brendanheywood commented 7 years ago

See also #2083 for timezone calc

scd commented 3 years ago

It looks like at least some of this is fixed, and there have been no further reports of similar problems in the last three years, so closing.