seatgeek / api-support

A support channel for the SeatGeek Platform

Data anomaly? #31

Closed emilymerwin closed 7 years ago

emilymerwin commented 7 years ago

I've been capturing the Super Bowl summary stats every 30 minutes since Tuesday morning and just wanted to check on two data points that look strange before I publish my chart - I'm seeing a lowest_price value of $111 at 10 a.m. EST on Tuesday at noon EST and a lowest_price value of $17 at 11 p.m. EST on Wednesday. Everything else is clustering around $3300 pretty consistently, so I wanted to check if maybe something got miscategorized?

josegonzalez commented 7 years ago

It's pretty likely that someone actually mis-listed their tickets and very quickly updated the prices. This happens every so often, and sometimes you can get a really good deal because someone fat-fingered the pricing when listing their tickets (either on our exchange or via a partner).

Alternatively, someone actually attempted to purchase those tickets and they were removed from the market.
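
For anyone charting a similar series, short-lived mis-listings like these can be flagged before publishing. The following is a minimal sketch, not anything SeatGeek provides: the function name `flag_outliers`, the threshold, and the sample values are all hypothetical. It marks samples that sit far from the median, measured in units of the median absolute deviation (MAD).

```python
from statistics import median

def flag_outliers(prices, threshold=5.0):
    """Return indices of samples that sit far from the median, in MAD units."""
    med = median(prices)
    # Median absolute deviation: a spread estimate that a few bad samples cannot skew.
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        # More than half the samples are identical; flag anything that differs.
        return [i for i, p in enumerate(prices) if p != med]
    return [i for i, p in enumerate(prices) if abs(p - med) / mad > threshold]

# Hypothetical half-hourly lowest_price samples (illustrative, not real data).
samples = [3300, 3350, 111, 3290, 3310, 3400, 17, 3320]
print(flag_outliers(samples))  # prints [2, 6]
```

Flagged points are probably better annotated than silently dropped, since they may reflect real (if brief) listings.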

emilymerwin commented 7 years ago

Ok, so now I'm still seeing highest_price returning $457,132, but on SeatGeek's website the highest one I see is $167k. Are there hidden tickets? Stale data? Is the API summary data live, and if not, how frequently is it updated?

josegonzalez commented 7 years ago

As far as stale data goes, we have algorithms that tell us how often to hit partners for ticket pricing information, which then very quickly updates our API (barring the small amount of cache time). High traffic/value events are updated extremely frequently. As well, event page data gets sent to our API in a very small amount of time, though we have no specific guarantees as to when it will update. We use this data on our team pages, so we try and keep the latency low, regardless of the event that is being updated.

Note: If you load an event page, you will see the most up-to-date information that we have.
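
Since there is no guarantee about when the summary data updates, it can help to store the fetch time next to each sample so that later comparisons against the website are made at a known moment. A minimal sketch, assuming the parsed event payload carries the summary values under a `stats` key (that key name and the helper `snapshot_from_event` are assumptions; only `lowest_price` and `highest_price` are named in this thread):

```python
from datetime import datetime, timezone

def snapshot_from_event(event, fetched_at=None):
    """Pair the summary prices from a parsed event payload with the fetch time."""
    # "stats" is an assumed key name; adjust to the actual response shape.
    stats = event.get("stats", {})
    when = fetched_at or datetime.now(timezone.utc)
    return {
        "fetched_at": when.isoformat(),
        "lowest_price": stats.get("lowest_price"),
        "highest_price": stats.get("highest_price"),
    }

# Hypothetical payload for illustration.
payload = {"stats": {"lowest_price": 3300, "highest_price": 457132}}
print(snapshot_from_event(payload))
```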

We try and match tickets based on section and row - in many cases, our partners cannot or do not return seat-level data. When we match tickets based on section/row, we only show the cheapest ticket - they are functionally the same to our audience when we do not have section/row information - so it's possible that the $457k ticket price was hidden in a ticket group. You can see this "hidden" information by clicking on the ticket group.

Here is an example where - upon clicking the listing row - we show alternatives (as well as other ticket info) for a specific ticket group:

https://cl.ly/2I1m1i2d0B2b
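
The grouping described above can be sketched roughly as follows. This is an illustration of the idea only, not SeatGeek's actual matching code: the field names (`section`, `row`, `price`), the helper names, and the sample listings are hypothetical. It shows how the highest price across all listings can exceed the highest price visible in the collapsed view.

```python
from collections import defaultdict

def group_listings(listings):
    """Group listing prices by (section, row), cheapest first within each group."""
    groups = defaultdict(list)
    for listing in listings:
        groups[(listing["section"], listing["row"])].append(listing["price"])
    return {key: sorted(prices) for key, prices in groups.items()}

def visible_prices(groups):
    """Only the cheapest listing in each group is shown in the collapsed view."""
    return {key: prices[0] for key, prices in groups.items()}

# Hypothetical listings for illustration.
listings = [
    {"section": "101", "row": "A", "price": 4000},
    {"section": "101", "row": "A", "price": 9000},  # hidden behind the 4000 listing
    {"section": "202", "row": "C", "price": 6000},
]

groups = group_listings(listings)
highest_shown = max(visible_prices(groups).values())
highest_overall = max(p for prices in groups.values() for p in prices)
print(highest_shown, highest_overall)  # prints 6000 9000
```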

One thing we may do for certain high-value events is hide un-mappable tickets. This can happen when a seller lists tickets in a way that isn't recognized by our system. We try really hard to do the right thing here, so if we can't map them, there is always the chance that they are incorrectly listed.

In general, when making technical decisions about what and how to list, we err on the side of the buyer, as we would prefer that those using our site have the best experience possible.

If you have any other questions, I should be able to put you in contact with someone on our PR team, who would be more than happy to help answer them. Feel free to email me (jose [at] seatgeek) and I can get the ball rolling.

emilymerwin commented 7 years ago

That's very helpful, thank you! I just wanted to understand what was going on.