lemire / RealisticTabularDataSets

Some realistic tabular datasets for testing (CSV)

Interest in geospatial data? #2

Open jamtho opened 6 years ago

jamtho commented 6 years ago

I noticed you don't have a high-density geospatial dataset, so I have been giving some thought to creating an output for you based on the ship tracking (AIS) data published by Marine Cadastre: https://marinecadastre.gov/ais/

This data gives a surprisingly complete picture of what's been happening around the US coastline over a multi-year period, including around 15 million position-bearing messages per day from boats (in 2014), with many broadcasting once every few seconds. They're apparently working on data through to 2017 now.

You can make some really wonderful maps with it: https://twitter.com/underdarkGIS/status/929340406902542341 https://twitter.com/jamesjamtho/status/893046485956386816

Or you can dive deep into the histories of individual vessels, e.g. seeing which other vessels they've interacted with. It's actually quite fun.

The output data I'd supply to this repository is fairly easy to work with. It would take the form of one CSV containing position-bearing messages: <ship_id, broadcast_timestamp, lat, lon, speed, course, ...>

And a second containing the identity data ships broadcast from time to time: <ship_id, broadcast_timestamp, ship_type, ship_length, ship_width, ...>
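To make the two-file layout a bit more concrete, here is a minimal Python sketch that loads the identity file into a lookup table and then streams the positional file once. Only the column names come from the lists above; the header rows, the timestamp format, the textual ship_type values, and every sample row are assumptions invented for illustration, and the real files may well differ.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows (assumption); real files would be opened with
# open(path, newline="") instead of io.StringIO.
POSITIONS_CSV = """ship_id,broadcast_timestamp,lat,lon,speed,course
1001,2014-01-01T00:00:00,47.60,-122.34,0.1,180.0
1001,2014-01-01T00:00:10,47.61,-122.35,5.2,182.5
2002,2014-01-01T00:00:05,29.95,-90.06,11.0,90.0
"""

IDENTITIES_CSV = """ship_id,broadcast_timestamp,ship_type,ship_length,ship_width
1001,2014-01-01T00:00:03,tug,30,9
2002,2014-01-01T00:00:07,cargo,180,28
"""

def load_ship_types(identity_file):
    """Map ship_id -> the last ship_type seen in the identity file."""
    types = {}
    for row in csv.DictReader(identity_file):
        types[row["ship_id"]] = row["ship_type"]
    return types

def max_speed_by_type(position_file, ship_types):
    """Stream position rows once, tracking the top speed per ship type."""
    best = defaultdict(float)
    for row in csv.DictReader(position_file):
        ship_type = ship_types.get(row["ship_id"], "unknown")
        best[ship_type] = max(best[ship_type], float(row["speed"]))
    return dict(best)

ship_types = load_ship_types(io.StringIO(IDENTITIES_CSV))
result = max_speed_by_type(io.StringIO(POSITIONS_CSV), ship_types)
print(result)
```

The identity file is small enough to hold in a dict here, while the much larger positional file is only ever read row by row, so memory use stays flat regardless of how many days of data are included.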

Generally people would import these CSVs into a relational database for OLAP-style work, or run them through serial processes at the command line. I personally tend to convert the first (positional) set into a file containing an array of pointerless C structs, which I access via mmap for performance; it works very cleanly. But you can also stream the data through Python very successfully, etc. It's really very regular.
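The fixed-size-record idea can be sketched roughly as follows. This is not the C implementation mentioned above; it uses Python's `struct` and `mmap` modules as a stand-in, and the record layout (field order, integer widths, float32 coordinates, epoch-second timestamps) is a hypothetical choice made for this example, as are the sample rows.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout (assumption): ship_id as uint32, timestamp as int64
# epoch seconds, then lat, lon, speed, course as float32. The "<" prefix
# selects little-endian with no padding, so every record is the same size.
RECORD = struct.Struct("<Iqffff")

def write_records(path, rows):
    """Pack each row into a fixed-size binary record, back to back."""
    with open(path, "wb") as f:
        for row in rows:
            f.write(RECORD.pack(*row))

def read_record(mm, index):
    """Random access: record i starts at byte offset i * RECORD.size."""
    return RECORD.unpack_from(mm, index * RECORD.size)

# Made-up sample rows for illustration only.
rows = [
    (1001, 1388534400, 47.60, -122.34, 0.1, 180.0),
    (1001, 1388534410, 47.61, -122.35, 5.2, 182.5),
    (2002, 1388534405, 29.95, -90.06, 11.0, 90.0),
]

path = os.path.join(tempfile.mkdtemp(), "positions.bin")
write_records(path, rows)

with open(path, "rb") as f:
    # Map the whole file read-only; pages are loaded lazily on access.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    record_count = len(mm) // RECORD.size
    last = read_record(mm, record_count - 1)
    mm.close()

print(record_count, last[0], last[1])
```

Because the records contain no pointers and all have the same size, any record can be located by simple arithmetic on its index, which is what makes the memory-mapped file usable as an array.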

Data volume is obviously determined by the geographic area and time period; it can get quite large if you want several days of the whole US and might end up dominating the repository, but smaller areas are still very interesting.

Does this sound like the kind of thing you're interested in having here? Completely understand if not!

lemire commented 6 years ago

That would be fantastic.