yarikoptic closed this issue 7 years ago
Commands which could be used, e.g.:
$> apt-cache policy libc6-dev
libc6-dev:
  Installed: 2.19-18+deb8u4
  Candidate: 2.19-18+deb8u4
  Version table:
 *** 2.19-18+deb8u4 0
        500 http://debian.csail.mit.edu/debian/ jessie/main amd64 Packages
        100 /var/lib/dpkg/status
     2.19-18+deb8u3 0
        500 http://security.debian.org/ jessie/updates/main amd64 Packages
$> apt-cache showpkg libc6-dev | head
Package: libc6-dev
Versions:
2.19-18+deb8u4 (/var/lib/apt/lists/debian.csail.mit.edu_debian_dists_jessie_main_binary-amd64_Packages) (/var/lib/dpkg/status)
 Description Language:
                 File: /var/lib/apt/lists/debian.csail.mit.edu_debian_dists_jessie_main_binary-amd64_Packages
                  MD5: 1bbdc717d9acdb44db940928d570e749
 Description Language: en
                 File: /var/lib/apt/lists/debian.csail.mit.edu_debian_dists_jessie_main_i18n_Translation-en
                  MD5: 1bbdc717d9acdb44db940928d570e749
$> head /var/lib/apt/lists/debian.csail.mit.edu_debian_dists_jessie_Release
Origin: Debian
Label: Debian
Suite: stable
Version: 8.5
Codename: jessie
Date: Sat, 04 Jun 2016 13:24:54 UTC
Architectures: amd64 arm64 armel armhf i386 mips mipsel powerpc ppc64el s390x
Components: main contrib non-free
Description: Debian 8.5 Released 04 June 2016
MD5Sum:
So we generate:
distributions:
- name: debian-1
  origin: Debian
  label: Debian
  suite: stable
  version: 8.5
  codename: jessie
  date: Sat, 04 Jun 2016 13:24:54 UTC
  components: main contrib non-free
  architectures: amd64
packages:
- name: libc6-dev
  version: 2.19-18+deb8u4 # from apt-cache policy
  architecture: amd64 # as identified from /var/..._<arch=amd64>_Packages filename
  distribution: debian-1
  component: main # as identified from /var/..._<component=main>_binary-<arch>_Packages
I think this should be sufficient information to later identify the apt repository (or repositories), e.g. from archive.debian.org or snapshot.debian.org, which would provide this particular package.
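To illustrate the filename-based identification mentioned in the comments of the record above, here is a minimal sketch. The function name `parse_packages_list_name` and the regular expression are assumptions made up for this example (they are not part of apt), and the pattern only covers list files named like `..._dists_<dist>_<component>_binary-<arch>_Packages`, as seen in the showpkg output.

```python
import re

# Hypothetical pattern for apt list file names such as
# debian.csail.mit.edu_debian_dists_jessie_main_binary-amd64_Packages
_PACKAGES_RE = re.compile(
    r"^(?P<site>.+)_dists_(?P<dist>.+)_(?P<component>[^_]+)"
    r"_binary-(?P<arch>[^_]+)_Packages$"
)


def parse_packages_list_name(path):
    """Return site/dist/component/arch parsed from a *_Packages file name.

    Returns None if the name does not follow the assumed pattern.
    """
    name = path.rsplit("/", 1)[-1]  # drop the directory part, if any
    match = _PACKAGES_RE.match(name)
    return match.groupdict() if match else None


info = parse_packages_list_name(
    "/var/lib/apt/lists/"
    "debian.csail.mit.edu_debian_dists_jessie_main_binary-amd64_Packages"
)
print(info)
```

Since apt replaces "/" with "_" in these names, a distribution such as jessie/updates would come back as `jessie_updates`, so the split between the fields is a heuristic and not a guarantee.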
In a Python implementation, the deb822 module (from python-debian) could provide many useful helpers for reading those Release files and possibly others.
In [6]: deb822.Release(codecs.open('/var/lib/apt/lists/debian.csail.mit.edu_debian_dists_jessie_Release', 'r', 'utf-8')).keys()
Out[6]:
['Origin',
'Label',
'Suite',
'Version',
'Codename',
'Date',
'Architectures',
'Components',
'Description',
'MD5Sum',
'SHA1',
'SHA256']
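For illustration, the Release fields listed above could be mapped onto a `distributions:` entry roughly as follows. This is only a sketch: to keep it runnable without python-debian it uses a tiny stand-in parser (`parse_release_header`) instead of `deb822.Release`, and both function names as well as the choice of copied fields are assumptions for this example.

```python
def parse_release_header(text):
    """Parse the single-line 'Key: value' fields of a Release file.

    Stand-in for deb822.Release: indented continuation lines (the checksum
    lists under MD5Sum, SHA1, SHA256) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        if not line or line[0].isspace():
            continue  # blank line or continuation line
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def distribution_record(release_fields, name, architectures):
    """Build a distribution entry from selected Release fields."""
    wanted = ("Origin", "Label", "Suite", "Version", "Codename", "Date",
              "Components")
    record = {"name": name}
    for key in wanted:
        if key in release_fields:
            record[key.lower()] = release_fields[key]
    # Only the architecture(s) of interest, not everything the mirror carries
    record["architectures"] = architectures
    return record


RELEASE_TEXT = """\
Origin: Debian
Label: Debian
Suite: stable
Version: 8.5
Codename: jessie
Date: Sat, 04 Jun 2016 13:24:54 UTC
Architectures: amd64 arm64 armel armhf i386 mips mipsel powerpc ppc64el s390x
Components: main contrib non-free
Description: Debian 8.5 Released 04 June 2016
MD5Sum:
"""

record = distribution_record(
    parse_release_header(RELEASE_TEXT), name="debian-1", architectures="amd64"
)
print(record)
```

With python-debian available, the `deb822.Release` object from the session above could presumably be passed to `distribution_record` in place of the stand-in dictionary, since it also supports key lookup.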
It is even possible to extract the version of an installed package directly from the 'status' file:
In [19]: [p for p in deb822.Packages.iter_paragraphs(codecs.open('/var/lib/dpkg/status', 'r', 'utf-8')) if p['Package'] == 'libc6-dev'][0]['Version']
Not sure yet how, in this pythonic way, to link to the '_Packages' file in order to identify the (In)Release file (I think that showpkg just efficiently scans all of those, so we might just as well use showpkg's command-line output).
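If we do go with showpkg's output, a sketch of the linking step could look like this. The helper names `versions_from_showpkg` and `release_candidates` are hypothetical, the parsing relies only on the output layout shown earlier in this issue, and the `_InRelease`/`_Release` naming next to the `_Packages` file is inferred from the file names on this one system.

```python
import re

# A version line looks like: "<version> (<file>) (<file>) ..."
_VERSION_LINE = re.compile(r"^(?P<version>\S+)\s+(?P<files>(?:\(\S+\)\s*)+)$")


def versions_from_showpkg(output):
    """Map each version in `apt-cache showpkg` output to its index files."""
    versions = {}
    in_versions = False
    for line in output.splitlines():
        if line.startswith("Versions:"):
            in_versions = True
            continue
        if line.startswith("Reverse Depends:"):
            break  # end of the section we care about
        if in_versions:
            match = _VERSION_LINE.match(line)
            if match:
                files = re.findall(r"\((\S+)\)", match.group("files"))
                versions[match.group("version")] = files
    return versions


def release_candidates(packages_file):
    """Guess the (In)Release paths belonging to a *_Packages list file."""
    match = re.match(
        r"^(?P<base>.+)_[^_]+_binary-[^_]+_Packages$", packages_file
    )
    if not match:
        return []  # e.g. /var/lib/dpkg/status has no Release file
    return [match.group("base") + suffix for suffix in ("_InRelease", "_Release")]


PKG_FILE = ("/var/lib/apt/lists/"
            "debian.csail.mit.edu_debian_dists_jessie_main_binary-amd64_Packages")
SHOWPKG = (
    "Package: libc6-dev\n"
    "Versions: \n"
    "2.19-18+deb8u4 (" + PKG_FILE + ") (/var/lib/dpkg/status)\n"
    " Description Language: \n"
    "                 File: " + PKG_FILE + "\n"
    "                  MD5: 1bbdc717d9acdb44db940928d570e749\n"
)

versions = versions_from_showpkg(SHOWPKG)
candidates = release_candidates(PKG_FILE)
print(versions)
print(candidates)
```

Whichever candidate path actually exists on disk could then be read with deb822 as above.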
Note 1: we might/should be able to state that the distribution: of a package is to be ignored/overridden as well, so we could e.g. regenerate an env originally built on Ubuntu on a Debian base. So overrides pretty much similar to those for the version: specification should be allowed.
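A rough sketch of how such overrides might be applied to the package records follows. The function `apply_overrides` and the convention that `None` means "ignore this field" are assumptions for this example only, not an agreed-upon design.

```python
import copy


def apply_overrides(packages, overrides):
    """Return copies of the package records with overrides applied.

    An override value of None drops the field (i.e. ignore it when
    re-creating the environment); any other value replaces it.
    """
    result = []
    for pkg in packages:
        updated = copy.deepcopy(pkg)  # leave the recorded spec untouched
        for field, value in overrides.items():
            if value is None:
                updated.pop(field, None)
            else:
                updated[field] = value
        result.append(updated)
    return result


# Hypothetical record originally captured on an Ubuntu system
recorded = [
    {"name": "libc6-dev", "version": "2.19-0ubuntu6",
     "architecture": "amd64", "distribution": "ubuntu-1"},
]

# Re-target to a Debian base and let apt pick whatever version is there
retargeted = apply_overrides(
    recorded, {"distribution": "debian-1", "version": None}
)
print(retargeted)
```

Keeping the recorded spec unmodified and deriving a new one means the original provenance is never lost, whatever overrides are requested.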
@rbuccigrossi, as you have played the most with reprozip, do you think this is a viable idea to implement "cheaply" within reprozip?
I became really enamored of reprozip's ability to record the execution of an experiment and play it back. But faced with this question, for playback it really is only a small step ahead of Ansible, Docker, Packer, and other environment creation scripts, since:
But if we aren't using reprozip's recording capability, we may be able to create an even simpler YAML format (possibly based upon reprozip's, stripping out the many things we don't need), and use that in a Docker (or Ansible, Packer, etc.) configuration script.
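To make that concrete, here is a minimal sketch that turns a list of package records into Dockerfile text. The function `dockerfile_from_spec` and the `debian:jessie` base image are assumptions for illustration; only the `name=version` pinning syntax of apt-get is relied upon.

```python
def dockerfile_from_spec(base_image, packages):
    """Render Dockerfile text installing the given packages, pinned by version."""
    pins = []
    for pkg in packages:
        pin = pkg["name"]
        if pkg.get("version"):
            pin += "=" + pkg["version"]  # apt-get accepts name=version
        pins.append(pin)
    lines = [
        "FROM " + base_image,
        "RUN apt-get update && apt-get install -y --no-install-recommends \\",
    ]
    lines += ["    " + pin + " \\" for pin in pins]
    lines.append("    && rm -rf /var/lib/apt/lists/*")
    return "\n".join(lines) + "\n"


dockerfile = dockerfile_from_spec(
    "debian:jessie",  # assumed base image matching the recorded codename
    [{"name": "libc6-dev", "version": "2.19-18+deb8u4"}],
)
print(dockerfile)
```

A pinned version will only install if the repositories configured in the image still carry it, which is exactly why the distribution information (and snapshot repositories) would be needed alongside the package list.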
Now that we have a bunch of potential tools on the table, I suggest we go back to our proposal, extract our goals for NICEMAN, and see if we can come up with a single-page treatment of the key requirements in light of what we now have on hand...
;-) correct -- so see/add/... that Repronim A/CL PI document we have (I just also emailed the list about it), where I am just sketching possible high-level use-cases.
Indeed, not everything in reprozip's trace file might be needed to reproduce the environment, BUT greedy me thinks that as long as we (or reprozip) trace the execution, we should collect as much detail as possible. E.g. one aspect that reprozip (or nidm for that matter, AFAIK) doesn't trace ATM, although it could, is resource requirements. Even if not precise, they could provide a ballpark for the necessary resources.
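As a ballpark of what collecting resource requirements could look like, here is a small sketch built on Python's standard `resource` module (Unix only). The wrapper `run_and_measure` is a hypothetical name, and this is not something reprozip does today.

```python
import resource
import subprocess
import sys


def run_and_measure(cmd):
    """Run a command and report rough CPU time and peak memory of children."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    completed = subprocess.run(cmd)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "returncode": completed.returncode,
        "user_cpu_seconds": after.ru_utime - before.ru_utime,
        "system_cpu_seconds": after.ru_stime - before.ru_stime,
        # High-water mark over all waited-for children so far, not just this
        # command; reported in kilobytes on Linux and in bytes on macOS.
        "max_rss": after.ru_maxrss,
    }


usage = run_and_measure([sys.executable, "-c", "sum(range(100000))"])
print(usage)
```

These numbers are coarse (no wall-clock time, disk, or GPU usage), but they might be enough to decide what size of machine or container to request when re-running.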
This is somewhat addressed already in the codebase ATM -- we are collecting an extended amount of information about deb packages/apt repos.
So we have sufficient information to reproduce any given environment later on.