cloudviz / agentless-system-crawler

A tool to crawl systems like crawlers for the web
Apache License 2.0

Append missing image RepoTag for crawling an image on private repository #355

Open tatsuhirochiba opened 6 years ago

Description

The crawler retrieves a container's image RepoTags in dockerutils.py, but keeps only the first entry. An image in a private repository can have two or more RepoTags in its metadata; this happens when multiple image names on the host refer to the same image ID. In that case, a later RepoTag may carry the information required for image crawling. To handle this, the crawler should keep every entry of RepoTags, not just the first.
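To illustrate the difference, here is a minimal sketch, not the actual code in dockerutils.py. It assumes the crawler has the image metadata as a dict shaped like the output of `docker image inspect`; the helper names `first_repo_tag` and `all_repo_tags` are hypothetical.

```python
# Sample metadata, reduced to the two fields relevant to this issue.
inspect = {
    "Id": "sha256:65db38df9855e0ef5df019e6dba6f3df68a32293a030244de2d2eda80c465658",
    "RepoTags": [
        "mydebian:test",
        "mycluster.icp:8500/default/mydebian:test",
    ],
}


def first_repo_tag(image_inspect):
    """Behaviour described in this issue: only the first RepoTag survives."""
    tags = image_inspect.get("RepoTags") or []
    return tags[0] if tags else None


def all_repo_tags(image_inspect):
    """Proposed behaviour: keep every RepoTag attached to the image ID."""
    # RepoTags may be absent or null for untagged images, hence the `or []`.
    return list(image_inspect.get("RepoTags") or [])


print(first_repo_tag(inspect))
print(all_repo_tags(inspect))
```

With the first helper, the registry-qualified name is lost; with the second, downstream code can still find it.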

How to Reproduce

When we tag an image and push it to a private repository, both names share one image ID, and inspecting either name lists both RepoTags.

root@host:~# docker tag mydebian:test mycluster.icp:8500/default/mydebian:test
root@host:~# docker push mycluster.icp:8500/default/mydebian:test
root@host:~# docker images | head
REPOSITORY                                              TAG                 IMAGE ID            CREATED             SIZE
mycluster.icp:8500/default/mydebian                     test                65db38df9855        10 hours ago        100MB
mydebian                                                test                65db38df9855        10 hours ago        100MB

root@host:~# docker image inspect mycluster.icp:8500/default/mydebian:test
[
    {
        "Id": "sha256:65db38df9855e0ef5df019e6dba6f3df68a32293a030244de2d2eda80c465658",
        "RepoTags": [
            "mydebian:test",
            "mycluster.icp:8500/default/mydebian:test"
        ],
...

For the reg crawler, the second RepoTag is required to build the namespace.
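As a rough sketch of how the registry-qualified entry could be picked out once all RepoTags are kept: the helpers below are hypothetical and rely on Docker's naming convention that the first path component is a registry host when it contains a `.` or `:` or equals `localhost`. How the reg crawler actually builds its namespace is not shown in this issue, so the sketch stops at selecting the tag.

```python
def split_registry(repo_tag):
    """Split a RepoTag into (registry_host, remainder).

    Returns (None, repo_tag) when the tag has no registry prefix.
    """
    first, sep, rest = repo_tag.partition("/")
    # Docker treats the first component as a registry host only if it
    # looks like a hostname (has a dot or a port) or is "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    return None, repo_tag


def registry_qualified_tags(repo_tags):
    """Keep only the RepoTags that name a registry explicitly."""
    return [tag for tag in repo_tags if split_registry(tag)[0] is not None]


repo_tags = ["mydebian:test", "mycluster.icp:8500/default/mydebian:test"]
print(registry_qualified_tags(repo_tags))
print(split_registry(repo_tags[1]))
```

Filtering by prefix instead of relying on list position avoids assuming that the registry-qualified name is always the second entry.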
