Closed bitsofbits closed 8 years ago
I think we need to do this "backwards" also - train with the full dataset, but predict on a degraded dataset. Possibly also doing both at the same time. What do you think?
The top text is wrong:
We train as normal, then progressively and randomly drop out more and more of the training data and see how that affects the resulting predicted results. We restrict ourselves to looking at a subset of the initial AIS points where there were 100-200 AIS points in the previous 24 hour period. This allows us to see the effects of dropout at consistent point bins.
This should be test data, as that's what you're doing in the code.
@redhog , good catch; fixed.
@davidkroodsma, can you take another look over this. If it looks good let us know and we can merge it into master.
closes #29 closes #11
Comparison of how precision / recall change as the AIS data is degraded by randomly dropping out points.
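The degradation step described above can be sketched as follows. This is a minimal illustration, not the repo's actual code: the `degrade_ais` helper and the point representation are hypothetical, and the real notebook additionally restricts to tracks with 100-200 AIS points in the prior 24 hours before measuring precision / recall.

```python
import numpy as np

def degrade_ais(points, dropout_rate, rng=None):
    """Randomly drop a fraction of AIS points.

    Hypothetical helper for illustration only; each point is kept
    independently with probability (1 - dropout_rate).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    keep = rng.random(len(points)) >= dropout_rate
    return [p for p, k in zip(points, keep) if k]

# Sweep progressively larger dropout rates, as the notebook does
# conceptually; stand-in integers take the place of real AIS points.
points = list(range(1000))
for rate in [0.0, 0.25, 0.5, 0.75]:
    kept = degrade_ais(points, rate)
    print(f"dropout={rate:.2f}: {len(kept)} points kept")
```

At each dropout rate the model would then be evaluated on the degraded points and precision / recall recorded, giving the robustness curve discussed below.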
@redhog, please take a look at this; the results are surprisingly robust against degrading the AIS, so I'd like a second opinion. Please take a look at:

- `Model-Sensitivity-to-AIS-Degradation.ipynb` — this is the main product. Please take a look at the logic and make sure it's sensible.
- `scripts/add_features.py` and `scripts/add_all_features.py` — these are cleaned-up versions of `add_measures.py` and `commands_to_add_measures.txt`. Please sanity check them. I haven't deleted the originals yet, but we should eventually.

When we are happy with this, we should run it by David K – he's the one who originally asked for this.