iterative / dvc

🦉 ML Experiments and Data Management with Git
https://dvc.org
Apache License 2.0
13.6k stars 1.18k forks

deb/rpm: build granular packages #2800

Closed Abrosimov-a-a closed 3 years ago

Abrosimov-a-a commented 4 years ago
  1. Big package size.
  2. Library duplication on the client system.
  3. The package needs to be upgraded whenever a dependency is upgraded.
  4. Patches from the distribution developers are not used. For example: #2768
  5. Perhaps you build the package by hand?
  6. You can find plenty of other information about this issue.

How do you currently build the packages? What about a system for automatically assembling correct packages? I can help you with that.

Roadmap:

Current status:

Already in Debian repository:

Not in Debian:

efiop commented 4 years ago

Hi @Abrosimov-a-a ! Great question! We package it like that for simplicity, as we don't have to deal with packaging all our dependencies into deb and rpm packages. The way we build the current deb and rpm packages is fully reflected by https://github.com/iterative/dvc/blob/master/scripts/build_posix.sh . We build binaries with pyinstaller and then package them into deb/rpm using the fpm tool. Then we publish them on our s3 bucket using https://github.com/iterative/dvc-s3-repo . Turning these into proper python-based packages would require packaging all our dependencies into deb/rpm packages first, which is tiresome, but could be simplified a lot by using that same fpm tool, which supports converting python packages into deb/rpm. If you feel like it, we would very much appreciate you taking a look 🙂

Abrosimov-a-a commented 4 years ago

Hi @efiop ! I will look into the most convenient way to organize the dependencies and the build. So far I have found 5 dependencies with problems. That's not so many.

What do you think about SCons?

efiop commented 4 years ago

@Abrosimov-a-a Yep, 5 deps doesn't sound too bad, indeed 🙂

What do you think about SCons?

First time I hear about it, tbh. Could you elaborate please?

Also worth mentioning that we are about to have a snap package https://github.com/iterative/dvc/pull/2778 thanks to @casperdcl 🙏, but it does seem to suffer from similar limitations, as it runs in an isolated container.

casperdcl commented 4 years ago

@efiop SCons is a pythonic replacement for CMake.

Abrosimov-a-a commented 4 years ago

Next questions:

  1. There are two different ways: a simple one (for your own repository) and a hard one (for the Debian repository). Which one do you need?
  2. What are we going to do with the dependencies? Include them in the DVC package or build separate packages?

efiop commented 4 years ago

@Abrosimov-a-a

There are two different ways: a simple one (for your own repository) and a hard one (for the Debian repository). Which one do you need?

Creating missing packages for our dependencies and placing them on our s3 deb/rpm repo should be good enough.

What are we going to do with the dependencies? Include them in the DVC package or build separate packages?

I guess we need to package each dependency, to set this up correctly, right?

Abrosimov-a-a commented 4 years ago

@efiop

I guess we need to package each dependency, to set this up correctly, right?

Yes. That's preferable.

casperdcl commented 4 years ago

@efiop @shcheklein this is one reason people create *.deb python packages. apt-get install python-dvc could depend on libffi-dev (or libffi6) for example.

efiop commented 4 years ago

@Abrosimov-a-a Please let us know if you have any questions or need any help 🙂

@casperdcl Good point!

Abrosimov-a-a commented 4 years ago

I have added a roadmap and the current status for all dependencies. Please let me know if I forgot any dependency. @efiop I think creating a new git branch for this issue would be a good idea. I can work on this issue only on weekends.

efiop commented 4 years ago

@Abrosimov-a-a Please create your own fork and use it to create a PR into the upstream, this is the preferred workflow for us. Let us know if you need any help, thank you so much for looking into that! 🙏

Abrosimov-a-a commented 4 years ago

At this point, all dependencies are built except asciimatics. Asciimatics has some problems. Work in progress...

efiop commented 4 years ago

@Abrosimov-a-a Great news! Luckily we are getting rid of asciimatics in https://github.com/iterative/dvc/pull/2815 , so better just skip it for now :)

Abrosimov-a-a commented 4 years ago

@efiop I now use Docker as the build environment. I think it's a good idea for building many packages with many temporary dependencies. It keeps the system clean. The Dockerfile can be easily converted to a shell script if you want to work on the native system.
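
A minimal sketch of the "throwaway container" idea: the build dependencies live only in the image, the sources and an output directory are bind-mounted, and the container is removed afterwards. The image name `dvc-deb-builder`, the script `./make-packages.sh` and the paths are all hypothetical placeholders; the thread does not show the actual Dockerfile or build commands.

```python
import shlex


def docker_build_argv(image, source_dir, output_dir, build_cmd):
    """Return the argv for a one-off container that builds packages."""
    return [
        "docker", "run",
        "--rm",                       # delete the container when it exits
        "-v", f"{source_dir}:/src",   # project sources
        "-v", f"{output_dir}:/out",   # where finished packages are written
        "-w", "/src",                 # run the build from the source tree
        image,
        *build_cmd,
    ]


# Every value below is a placeholder for illustration.
argv = docker_build_argv(
    image="dvc-deb-builder",
    source_dir="/home/user/dvc",
    output_dir="/home/user/dvc/dist",
    build_cmd=["./make-packages.sh", "/out"],
)

# Print instead of executing, so the sketch runs without Docker installed.
print(shlex.join(argv))
```

Because temporary build dependencies are installed in the image rather than on the host, nothing is left behind on the host once the container exits.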

efiop commented 4 years ago

@Abrosimov-a-a Sure, let's indeed use docker to build them, that sounds great!

Abrosimov-a-a commented 4 years ago

@efiop The build system suggests several additional packages:

  1. python3-networkx
  2. python3-yaml
  3. python3-flufl.lock

They are not in setup.py. Should I add them to setup.py or exclude them from the package dependencies?

casperdcl commented 4 years ago

Afaik they're already in setup.py.

Abrosimov-a-a commented 4 years ago

@casperdcl Yes, my fault. Already in setup.py.
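
A check like this one can be scripted. The sketch below maps PyPI project names to Debian-style `python3-*` names and reports which suggested packages are not covered. The mapping is deliberately naive (lowercase the name, plus a small override table for cases such as PyYAML, which Debian ships as python3-yaml), and `install_requires` here is an illustrative subset, not a copy of DVC's setup.py. The function names are hypothetical.

```python
# Cases where the Debian name follows the import name, not the PyPI name.
OVERRIDES = {"pyyaml": "python3-yaml"}


def debian_name(pypi_name):
    """Guess the Debian package name for a PyPI project (naive)."""
    key = pypi_name.lower()
    return OVERRIDES.get(key, f"python3-{key}")


def uncovered(suggested, install_requires):
    """Return suggested Debian packages that no requirement maps to."""
    covered = {debian_name(name) for name in install_requires}
    return sorted(set(suggested) - covered)


# The three packages suggested by the build system in this thread.
suggested = ["python3-networkx", "python3-yaml", "python3-flufl.lock"]

# Illustrative subset with bare names (no version specifiers).
install_requires = ["networkx", "PyYAML", "flufl.lock"]

missing = uncovered(suggested, install_requires)
print(missing)  # → []
```

An empty result means every suggested package already corresponds to a declared requirement, which matches the conclusion reached in the thread.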

Abrosimov-a-a commented 4 years ago

Debian stable has some extra packages needed by DVC, but the Debian versions are older than what setup.py requires. Are you sure you need newer versions of those packages?

Package          Debian ver.      DVC version.
---------------- ---------------- -----------------
python3-boto3    1.9.86-1         1.9.201
python3-arrow    0.12.1           0.14.0
python3-paramiko 2.4.2            2.5.0
python3-gssapi   1.4.1
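
The comparison in the table can be reproduced with a small sketch. It strips the Debian revision suffix (the `-1` in `1.9.86-1`) and compares versions as integer tuples, which matters because a plain string comparison would order `1.9.86` after `1.9.201`. This simplified parser is an assumption that fits the rows above; it does not handle epochs, tildes or non-numeric parts, for which `dpkg --compare-versions` is the proper tool. python3-gssapi is left out because its required version is missing from the table.

```python
def parse_upstream(version):
    """Turn '1.9.86-1' into (1, 9, 86), dropping the Debian revision."""
    upstream = version.split("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))


# (package, version in Debian stable, version required by DVC), from the table.
rows = [
    ("python3-boto3", "1.9.86-1", "1.9.201"),
    ("python3-arrow", "0.12.1", "0.14.0"),
    ("python3-paramiko", "2.4.2", "2.5.0"),
]

too_old = [
    package
    for package, debian, required in rows
    if parse_upstream(debian) < parse_upstream(required)
]

print(too_old)  # → ['python3-boto3', 'python3-arrow', 'python3-paramiko']
```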

Abrosimov-a-a commented 4 years ago

You can now test the new build system in my repo.

efiop commented 4 years ago

@Abrosimov-a-a Are you talking about the stable deb repo? I'm pretty sure we'll be alright with those older versions. Btw, asciimatics is already removed 😉

Abrosimov-a-a commented 4 years ago

Are you talking about the stable deb repo?

Yes.

I'm pretty sure we'll be alright with those older versions.

This will facilitate the work.

Btw, asciimatics is already removed

I know. I watched it.

Abrosimov-a-a commented 4 years ago

Waiting for your suggestions.

Abrosimov-a-a commented 4 years ago

So, what next?

  1. Integrate build_deb in dvc-s3-repo?
  2. Make RPM builder?
  3. Debian Experimental integration?

efiop commented 4 years ago

1 and 2 sound good. Could you elaborate on 3, please?

ghost commented 4 years ago

~@Abrosimov-a-a , why is debian/compat file needed? it is just a 10~

https://github.com/iterative/dvc/blob/93280c937b9160003afb0d2f3fd473c03d6d9673/debian/compat#L1


Got it! https://www.debian.org/doc/manuals/maint-guide/dother.en.html

efiop commented 3 years ago

Our deps are rapidly changing and it will be a challenge to keep up with them with corresponding deb packages. Considering that there hasn't been much activity with the deb package, I'm removing unused code for now and closing this issue. If anyone is interested in granularly packaging deb, please see https://github.com/iterative/dvc-s3-repo where we already build and deploy our standalone deb/rpm packages on our public repositories. Granular deb logic should be added there.