bwarren2 / datadrivendota

Codebase for dota analytics

Replay data storage #540

Closed bwarren2 closed 8 years ago

bwarren2 commented 8 years ago

Stabilizing around the state data as a predictable list of objects with conventional names. Time lapse looks neat.

The file format for state files has stabilized around something like this:

[
        {
            offset_time: 0,
            base_damage: 10,
            <bonus_damage: 10,>
            <total_damage: 10,>
        },
        {...},
        {...},
]

Virtues of this approach: the accessor functions applied to a pms or a side/match rollup are the same, because everything is a list of objects with the same field names; only the values and filenames change. The time-lapse feature presumes this structure, and it works really well and looks really cool.
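To illustrate the shared-accessor point, here is a minimal sketch in Python. The function name `series` and the sample values are made up for illustration; only the field names `offset_time`, `base_damage`, and `bonus_damage` come from the format above, and optional fields are assumed to be simply absent when not recorded.

```python
import json


def series(states, field, default=None):
    """Return (offset_time, value) pairs for one field of a state list.

    Optional fields (the ones shown in angle brackets above) may be
    missing from an entry, so they fall back to `default`.
    """
    return [(s["offset_time"], s.get(field, default)) for s in states]


# Hypothetical sample data: one per-player file and one side rollup.
pms_states = json.loads(
    '[{"offset_time": 0, "base_damage": 10},'
    ' {"offset_time": 1, "base_damage": 12, "bonus_damage": 3}]'
)
side_states = json.loads(
    '[{"offset_time": 0, "base_damage": 50},'
    ' {"offset_time": 1, "base_damage": 61}]'
)

# The same accessor works on both, because the shape is identical.
pms_base = series(pms_states, "base_damage")
side_base = series(side_states, "base_damage")
pms_bonus = series(pms_states, "bonus_damage", default=0)
```

Only the input file differs between the two calls, which is the property the time lapse relies on.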

For discussion, an alternative that might be more extensible is:

{
    meta:{
        version: 1.x
    },
    data: [
        {},
        {},
        {
            offset_time: 0,
            base_damage: 10,
            <bonus_damage: 10,>
            <total_damage: 10,>
        },
    ]
}

Is this worth it? Should we add it later? And what happens when we add a field: do we reparse all the old matches that don't have it?
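One way to defer the decision is a loader that accepts both shapes, so the envelope can be added later without breaking old files. The sketch below is an assumption, not existing code: `load_states` is a hypothetical name, and treating a bare list as version "0" is a convention invented here.

```python
def load_states(payload):
    """Normalize either file shape to (version, list_of_states).

    - A bare list is treated as the original, unversioned format
      (labelled "0" here; that label is an assumption).
    - A dict is treated as the {meta: {...}, data: [...]} envelope.
    """
    if isinstance(payload, list):
        return "0", payload
    meta = payload.get("meta", {})
    return str(meta.get("version", "0")), payload.get("data", [])


# Both shapes yield the same list of state objects.
bare = [{"offset_time": 0, "base_damage": 10}]
wrapped = {"meta": {"version": "1.x"}, "data": bare}

bare_version, bare_data = load_states(bare)
wrapped_version, wrapped_data = load_states(wrapped)
```

Callers that only need the list can ignore the version, so existing accessors keep working on either shape.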

bwarren2 commented 8 years ago

One thing: the current form implementation in time_lapse.html is hacky and bad, and should probably be a Django form or a crispy form. Maybe a Django form for now, and prettify with revenue?

wlonk commented 8 years ago

The idea of versioning your data is compelling, but your point about "so, what do we do if it's not up to date" stands. Also, versioning info that lives in the file on S3 is a huge pain compared to meta info that's present in the path. Just worth considering.

I'd suggest including versioning in the path, ginning up a migrations system to handle old data being reparsed, and being certain to include a certain degree of support for data that is incomplete by current standards—i.e. failing gracefully, at least.
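A rough sketch of what that could look like, in Python. Everything here is hypothetical: the `v<N>/...` key layout, the names `version_from_key`, `MIGRATIONS`, and `upgrade`, and the example migration that marks a missing `bonus_damage` as `None` rather than reparsing the replay.

```python
import re

CURRENT_VERSION = 2
# Assumed key layout: "v<N>/<anything>", e.g. "v1/match_1/states.json".
KEY_PATTERN = re.compile(r"^v(\d+)/")


def version_from_key(key):
    """Read the format version from the storage path, or None if absent."""
    match = KEY_PATTERN.match(key)
    return int(match.group(1)) if match else None


def migrate_v1_to_v2(state):
    """Hypothetical migration: mark a field the old parser never wrote."""
    upgraded = dict(state)
    upgraded.setdefault("bonus_damage", None)
    return upgraded


# Maps a version to the function that upgrades it by one step.
MIGRATIONS = {1: migrate_v1_to_v2}


def upgrade(key, states):
    """Bring states up to CURRENT_VERSION where possible.

    Fails gracefully: an unversioned key or a missing migration step
    returns the best available data instead of raising.
    """
    version = version_from_key(key)
    if version is None:
        return states
    while version < CURRENT_VERSION:
        step = MIGRATIONS.get(version)
        if step is None:
            break
        states = [step(s) for s in states]
        version += 1
    return states


old = [{"offset_time": 0, "base_damage": 10}]
upgraded = upgrade("v1/match_1/states.json", old)
untouched = upgrade("match_1/states.json", old)
```

Because the version lives in the key, old data can be found by listing a path prefix, without downloading and opening each file.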

bwarren2 commented 8 years ago

Frustratingly, the failures in Valve's API are producing dependencies between this and more task upgrade work. This is 90% done but invisible to the public; I am merging it to deal with db migrations and add more error handling to the API calls.