kamu-data / kamu-cli

Next-generation decentralized data lakehouse and a multi-party stream processing network
https://kamu.dev

Documentation and Installation question(s) #34

Closed JvD007 closed 3 years ago

JvD007 commented 3 years ago

Some questions around installation and user documentation:

Create a dataset on S3 etc

TX, Jaco

sergiimk commented 3 years ago

Hi Jaco, thanks for the feedback!

I agree, we will focus on a better first-time user experience in the next few days, including:

I will keep you posted on the progress here as we add and update things.

Regarding docker, we use it for these purposes:

So I think getting rid of docker completely can be our long-term goal. In the short term, we can start moving more and more features out of docker, allowing the majority of users (data consumers) to use kamu without it.

For example:

Overall, docker is becoming more and more popular in the data science community, as it simplifies repeatability and sharing of data projects. So if you encounter specific issues with it, we can try to help you address them (e.g. by linking a step-by-step guide on setting up docker with WSL2).

sergiimk commented 3 years ago

In #35 I've added detailed help to all commands along with common usage examples (released in v0.38.2).

I also gave Metadata Reference a better structure to make the job of writing dataset manifests easier.

I also added a new section of documentation on Merge Strategies with examples.
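For readers new to the idea, here is a rough Python sketch of what a ledger-style merge does conceptually: when the same source is ingested again, only records whose primary key has not been seen before are appended. This illustrates the general idea only and is not kamu's actual implementation; the function name `ledger_merge`, the `primary_key` parameter, and the record layout are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# A record is modelled as a plain dict; this layout is an assumption for illustration.
Record = Dict[str, object]


def ledger_merge(
    existing: List[Record],
    incoming: Iterable[Record],
    primary_key: Tuple[str, ...],
) -> List[Record]:
    """Return the incoming records whose primary key is not yet in the ledger.

    Hypothetical helper: it treats the data as an append-only ledger and
    de-duplicates on the given primary-key columns.
    """
    # Keys of every record that has already been ingested.
    seen = {tuple(rec[col] for col in primary_key) for rec in existing}

    new_records: List[Record] = []
    for rec in incoming:
        key = tuple(rec[col] for col in primary_key)
        if key not in seen:
            # Remember the key so duplicates within the same batch are skipped too.
            seen.add(key)
            new_records.append(rec)
    return new_records


if __name__ == "__main__":
    ledger = [{"id": 1, "city": "Kyiv"}]
    batch = [{"id": 1, "city": "Kyiv"}, {"id": 2, "city": "Lviv"}]
    print(ledger_merge(ledger, batch, primary_key=("id",)))
```

Re-ingesting the same batch a second time would add nothing, which is the property that makes repeated polling of a source safe under this kind of strategy.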

sergiimk commented 3 years ago

Update: kamu now supports podman for running containers without the need for sudo or any other privilege escalation.