ReproNim / reproman

ReproMan (AKA NICEMAN, AKA ReproNim TRD3)
https://reproman.readthedocs.io

apt sources analysis assumes netloc #59

Closed yarikoptic closed 7 years ago

yarikoptic commented 7 years ago

Didn't check yet, but we need to verify that we can handle/record file:// URL sources correctly, since .netloc can be empty for them.
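For reference, Python's standard `urllib.parse.urlparse` does return an empty `netloc` for such URLs, so code keyed on `netloc` would need a fallback. A minimal sketch of the behaviour; the `source_key` helper is hypothetical and not part of the project's code:

```python
from urllib.parse import urlparse


def source_key(uri):
    """Return an identifier for an apt source URI.

    Hypothetical helper: use the host when there is one, otherwise
    fall back to the path (as happens for file:// sources).
    """
    parts = urlparse(uri)
    return parts.netloc or parts.path


remote = urlparse("http://us.archive.ubuntu.com/ubuntu")
local = urlparse("file:///tmp/repo")
print(repr(remote.netloc))                   # 'us.archive.ubuntu.com'
print(repr(local.netloc), repr(local.path))  # '' '/tmp/repo'
```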

on a quick check where I have generated a local repo with

mkdir /tmp/repo
cp some.deb /tmp/repo
cd /tmp/repo
dpkg-scanpackages . >| Packages
apt-ftparchive release . >| Release
echo 'deb file:///tmp/repo ./' > /etc/apt/sources.list.d/local.list
apt-get update

I have got:

lrwxrwxrwx 1 root root       20 Feb 20 13:09 _tmp_repo_._Packages -> /tmp/repo/./Packages
-rw-r--r-- 1 root root      816 Feb 20 13:08 _tmp_repo_._Release

so might be worth a quick test of some kind to check if new code in https://github.com/ReproNim/niceman/pull/58/files#diff-e6a1b8061859cc341e0dadc10ae5574dR288 handles it correctly
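The list file names above look like the source URI with the scheme dropped and every "/" turned into "_". A rough sketch of that mapping, assuming nothing else needs escaping (real apt also quotes some special characters, which this hypothetical `uri_to_list_name` helper ignores):

```python
from urllib.parse import urlparse


def uri_to_list_name(uri):
    """Approximate the name apt gives an index file under /var/lib/apt/lists.

    Assumption: the name is host + path with "/" replaced by "_".
    For file:// URIs the host part is empty, hence the leading "_".
    """
    parts = urlparse(uri)
    return (parts.netloc + parts.path).replace("/", "_")


print(uri_to_list_name("file:///tmp/repo/./Packages"))
print(uri_to_list_name("file:///tmp/repo/./Release"))
```

If the assumption holds, this would give a cheap way to test empty-netloc handling without touching a real apt cache.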

rbuccigrossi commented 7 years ago

Grrr... I installed my open copy of alienblaster into a local repo "/my/repo":

mkdir /my
mkdir /my/repo
cd /my/repo
wget http://mirrors.kernel.org/ubuntu/pool/universe/a/alienblaster/alienblaster_1.1.0-9_amd64.deb
dpkg-scanpackages . >| Packages
apt-ftparchive release . >| Release
echo 'deb file:///my/repo ./' > /etc/apt/sources.list.d/local.list
apt-get update
apt-get install alienblaster

Then I created a reprozip trace file with the file "/usr/games/alienblaster". Unfortunately the Origin comes back as:

- {name: apt____0, type: apt, component: '', label: '', site: '', origin: '', archive: ''}

The origin information from the apt library does not include anything else that would identify the local location. However, the version object does have URIs (including my local one: 'file:///my/repo/pool/universe/a/alienblaster/alienblaster_1.1.0-9_amd64.deb'). So I may be able to get the URI from the version and match it with the origin to get the local repository location. First, though, I will hunt through the lower-level apt_pkg library to see if I can get more direct access to the policy, since apt-cache policy alienblaster returns more useful information:

alienblaster:
  Installed: 1.1.0-9
  Candidate: 1.1.0-9
  Version table:
 *** 1.1.0-9 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/universe amd64 Packages
        500 file:/my/repo ./ Packages
        100 /var/lib/dpkg/status

yarikoptic commented 7 years ago

Thanks for the investigation! Maybe we should put it on the back burner for now, since those apt sources wouldn't be easily reproducible anyway.

rbuccigrossi commented 7 years ago

Well, if we delve deeper into the apt objects (specifically into the private member _cand) we can find out the URI associated with the origin.

Here's the loop to find the origins from the versions (from package.py):

        origins = []
        for (packagefile, _unused) in self._cand.file_list:
            origins.append(Origin(self.package, packagefile))
        return origins

And here's the loop to find the uris (from package.py):

        for (packagefile, _unused) in self._cand.file_list:
            indexfile = self.package._pcache._list.find_index(packagefile)
            if indexfile:
                yield indexfile.archive_uri(self._records.filename)

So now I know that the file_list member has more information about the origins, and maybe if I query that more directly I can get better information about local repositories.

rbuccigrossi commented 7 years ago

Look at that... the IndexFile associated with the version's origin has exactly what I need: ArchiveURI. Here are the apt_pkg PackageFile and IndexFile objects associated with one external and one local origin:

External:

<apt_pkg.PackageFile object: filename:'/var/lib/apt/lists/us.archive.ubuntu.com_ubuntu_dists_xenial_universe_binary-amd64_Packages'  a=xenial,c=universe,v=16.04,o=Ubuntu,l=Ubuntu arch='amd64' site='us.archive.ubuntu.com' IndexType='Debian Package Index' Size=41813552 ID:6>
<pkIndexFile object: Label:'Debian Package Index' Describe='http://us.archive.ubuntu.com/ubuntu xenial/universe amd64 Packages (/var/lib/apt/lists/us.archive.ubuntu.com_ubuntu_dists_xenial_universe_binary-amd64_Packages)' Exists='1' HasPackages='1' Size='41813552'  IsTrusted='1' ArchiveURI='http://us.archive.ubuntu.com/ubuntu/'>

Internal:

<apt_pkg.PackageFile object: filename:'/var/lib/apt/lists/_my_repo_._Packages'  a=,c=,v=,o=,l= arch='' site='' IndexType='Debian Package Index' Size=1024 ID:43>
<pkIndexFile object: Label:'Debian Package Index' Describe='file:/my/repo ./ Packages (/var/lib/apt/lists/_my_repo_._Packages)' Exists='1' HasPackages='1' Size='1024'  IsTrusted='0' ArchiveURI='file:///my/repo/'>

So if I grab the ArchiveURI and associate that with the Origin, then we'd be able to perfectly recreate the Release file name for both internal and external repositories (and therefore get the date).
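Once the Release file is located, the date could be read from its "Date:" field. A minimal sketch, assuming that field is the intended source of the date and that it uses the usual RFC 2822 style layout; `release_date` is a hypothetical helper name:

```python
from email.utils import parsedate_to_datetime


def release_date(release_text):
    """Return the Date field of a Release file as a datetime, or None.

    Assumption: the file contains a line such as
    "Date: Thu, 21 Apr 2016 23:23:46 UTC".
    """
    for line in release_text.splitlines():
        if line.startswith("Date:"):
            return parsedate_to_datetime(line[len("Date:"):].strip())
    return None


sample = "Origin: Ubuntu\nSuite: xenial\nDate: Thu, 21 Apr 2016 23:23:46 UTC\n"
print(release_date(sample))
```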

yarikoptic commented 7 years ago

Wonderful digging ;) Eventually, though, we had better put together a good set of "functionality" tests (without mocking) for it, so we can verify compatibility/operation across different Debian and Ubuntu releases.

rbuccigrossi commented 7 years ago

By using the lower-level apt_pkg PackageFile object, I am able to get access to the package filename, which is closely related to the release filename. Specifically, the release filename (minus its trailing "Release" or "InRelease") is a leading prefix of the package filename.

So by getting the package filename and progressively slicing at the right-most underscore, I'm able to much more reliably get the release filename. This worked with external and local sources, including an ugly one I made through the following:

mkdir /my
mkdir /my/repo2
cd /my/repo2
mkdir ubuntu
cd ubuntu
wget http://mirrors.kernel.org/ubuntu/pool/universe/a/alienblaster/alienblaster_1.1.0-9_amd64.deb
cd ..
dpkg-scanpackages ubuntu >| ubuntu/Packages
apt-ftparchive release ubuntu >| ubuntu/Release
echo 'deb file:///my/repo2 ubuntu/' >> /etc/apt/sources.list.d/local.list
apt-get update
apt-get install alienblaster
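The slicing approach described above could be sketched roughly as follows. This is an illustration, not the code from the pull request; `find_release_file` and its injectable `exists` parameter are hypothetical names:

```python
import os


def find_release_file(packages_path, exists=os.path.exists):
    """Guess the Release/InRelease file belonging to a Packages list file.

    Repeatedly cuts the file name at its right-most underscore and checks
    whether "<prefix>_InRelease" or "<prefix>_Release" exists.
    Returns the first match, or None if nothing is found.
    """
    dirname, name = os.path.split(packages_path)
    prefix = name
    while "_" in prefix:
        prefix = prefix.rsplit("_", 1)[0]
        for suffix in ("InRelease", "Release"):
            candidate = os.path.join(dirname, prefix + "_" + suffix)
            if exists(candidate):
                return candidate
    return None


# Example against a fake set of list files instead of the real filesystem.
fake_lists = {
    "/var/lib/apt/lists/_my_repo_._Release",
    "/var/lib/apt/lists/_my_repo2_ubuntu_Release",
}
print(find_release_file("/var/lib/apt/lists/_my_repo2_ubuntu_Packages",
                        exists=fake_lists.__contains__))
```

Passing `exists` in makes the lookup testable without a real /var/lib/apt/lists directory.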

With the update I made to #63 I now get the origins:

origins:
- {name: apt_NeuroDebian_xenial_non-free_0, type: apt, origin: NeuroDebian, label: NeuroDebian,
  site: neuro.debian.net, archive: xenial, archive_uri: 'http://neuro.debian.net/debian/',
  component: non-free, date: '2017-02-22 16:11:11+00:00'}
- {name: apt_Ubuntu_xenial-updates_main_0, type: apt, origin: Ubuntu, label: Ubuntu,
  site: us.archive.ubuntu.com, archive: xenial-updates, archive_uri: 'http://us.archive.ubuntu.com/ubuntu/',
  component: main, date: '2017-02-22 22:30:31+00:00'}
- {name: apt_Ubuntu_xenial_main_0, type: apt, origin: Ubuntu, label: Ubuntu, site: us.archive.ubuntu.com,
  archive: xenial, archive_uri: 'http://us.archive.ubuntu.com/ubuntu/', component: main,
  date: '2016-04-21 23:23:46+00:00'}
- {name: apt_Ubuntu_xenial_multiverse_0, type: apt, origin: Ubuntu, label: Ubuntu,
  site: us.archive.ubuntu.com, archive: xenial, archive_uri: 'http://us.archive.ubuntu.com/ubuntu/',
  component: multiverse, date: '2016-04-21 23:23:46+00:00'}
- {name: apt_Ubuntu_xenial_universe_0, type: apt, origin: Ubuntu, label: Ubuntu, site: us.archive.ubuntu.com,
  archive: xenial, archive_uri: 'http://us.archive.ubuntu.com/ubuntu/', component: universe,
  date: '2016-04-21 23:23:46+00:00'}
- {name: apt____0, type: apt, origin: '', label: '', site: '', archive: '', archive_uri: 'file:///my/repo/',
  component: '', date: '2017-02-21 02:42:30+00:00'}
- {name: apt____1, type: apt, origin: '', label: '', site: '', archive: '', archive_uri: 'file:///my/repo2/',
  component: '', date: '2017-02-23 00:11:41+00:00'}
- {name: apt__now_now_0, type: apt, origin: '', label: '', component: now, site: '',
  archive: now}

For now I have kept "archive_uri" so that local repositories are easy to identify.
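With that field in place, spotting local repositories could be as simple as checking the URI scheme. A small sketch over origin records shaped like the ones listed above; `is_local_origin` is a hypothetical helper:

```python
from urllib.parse import urlparse


def is_local_origin(origin):
    """Return True if the origin's archive_uri points at a file:// repository.

    Origins without an archive_uri (such as the "now" entry) count as not local.
    """
    uri = origin.get("archive_uri", "")
    return urlparse(uri).scheme == "file"


origins = [
    {"name": "apt_Ubuntu_xenial_main_0", "archive_uri": "http://us.archive.ubuntu.com/ubuntu/"},
    {"name": "apt____0", "archive_uri": "file:///my/repo/"},
    {"name": "apt____1", "archive_uri": "file:///my/repo2/"},
    {"name": "apt__now_now_0"},
]
print([o["name"] for o in origins if is_local_origin(o)])
```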