droher / boxball

Prebuilt Docker images with Retrosheet's complete baseball history data for many analytical frameworks. Includes Postgres, cstore_fdw, MySQL, SQLite, Clickhouse, Drill, Parquet, and CSV.
Apache License 2.0

Need help? Want new features/endpoints? #50

Closed droher closed 1 year ago

droher commented 4 years ago

I'm looking to help make this data as accessible as possible for experienced researchers and novice data analysts alike -- please let me know if anything would help! This includes documentation, additional features, new data sources, or new load endpoints (including the ones listed in the other features so I can prioritize).

double-dose-larry commented 4 years ago

I see that Jupyter is on the list. What was your idea for integration?

droher commented 4 years ago

Create a running Jupyter notebook server inside the container and open up a port, so you could just navigate to localhost:xxxx and get a running notebook.
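A rough sketch of what that could look like from the host side, written as a small Python helper that only assembles the `docker run` command. The image name `example/boxball-jupyter` is hypothetical, port 8888 is an assumption (it is Jupyter's default, the comment above leaves the port unspecified), and the image is assumed to already have Jupyter installed.

```python
import shlex


def jupyter_run_command(image, host_port=8888, container_port=8888):
    """Assemble a `docker run` invocation that publishes a notebook port.

    `image` is a placeholder for an image that already has Jupyter installed.
    """
    return [
        "docker", "run", "--rm",
        # Map host_port on the host to container_port inside the container.
        "-p", f"{host_port}:{container_port}",
        image,
        # Listen on all interfaces so the published port is reachable.
        "jupyter", "notebook",
        "--ip=0.0.0.0",
        f"--port={container_port}",
        "--no-browser",
    ]


if __name__ == "__main__":
    # Hypothetical image name, for illustration only.
    print(shlex.join(jupyter_run_command("example/boxball-jupyter")))
```

With a mapping like this, the notebook would be reachable at `localhost:<host_port>` on the host. If the container runs as root, Jupyter may additionally need `--allow-root`.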

prrapo commented 4 years ago

First off, let me thank you for this project; it's been super fun as a baseball fan and a great way to hone data skills. I was wondering whether it would be possible to generate armv7 (and potentially arm64) images for those who want to run these containers on a Raspberry Pi 4. Docker's buildx tool provides a relatively pain-free way of generating the images. I've tried for the past few days to generate my own images with a reverse-engineered Dockerfile but have run into a wall with a non-descriptive error message.
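For reference, a multi-platform buildx invocation generally has the shape sketched below. This is a minimal illustration, not the build that failed: the tag `example/boxball-postgres` and the build context `.` are hypothetical, and the helper only assembles the command rather than running it.

```python
import shlex


def buildx_command(tag, platforms=("linux/arm/v7", "linux/arm64"),
                   context=".", push=True):
    """Assemble a `docker buildx build` command for several target platforms."""
    cmd = [
        "docker", "buildx", "build",
        # Comma-separated list of target platforms.
        "--platform", ",".join(platforms),
        "-t", tag,
    ]
    if push:
        # Push the resulting multi-platform image to a registry.
        cmd.append("--push")
    cmd.append(context)
    return cmd


if __name__ == "__main__":
    # Hypothetical tag, for illustration only.
    print(shlex.join(buildx_command("example/boxball-postgres")))
```

Every base image referenced by the Dockerfile has to be published for each requested platform, otherwise the build for that platform cannot proceed.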

droher commented 4 years ago

@prrapo You're welcome! This is another good reason to decouple the data from the images -- the load Dockerfiles all require an upstream image to load data from, and all of those images are using x64. If they were just downloading data from a file sharing site, it would hopefully be trivial to build an ARM postgres target exactly like the current one: https://hub.docker.com/r/arm64v8/postgres