banyanops / collector

A framework for Static Analysis of Docker container images

Looping for Metadata #12

Closed kudva closed 8 years ago

kudva commented 9 years ago

I followed the instructions to clone from git, and ran the following command:

I have Docker running on the host and can start containers from the same image, banyanops/nginx.

But when I ran: sudo COLLECTOR_DIR=$PWD $GOPATH/bin/collector index.docker.io banyanops/nginx

It goes into an infinite loop of attempting to get metadata from Docker Hub (as shown below). This has been going on for 8 hours or more. What am I doing wrong?

Thanks for your help!

[23:44:15 2015/06/17 -0500] INFO Looping in 60 seconds
[23:45:15 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:45:16 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:45:18 2015/06/17 -0500] INFO No new metadata in this iteration
[23:45:18 2015/06/17 -0500] INFO Looping in 60 seconds
[23:46:18 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:46:19 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:46:20 2015/06/17 -0500] INFO No new metadata in this iteration
[23:46:20 2015/06/17 -0500] INFO Looping in 60 seconds
[23:47:20 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:47:21 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:47:23 2015/06/17 -0500] INFO No new metadata in this iteration
[23:47:23 2015/06/17 -0500] INFO Looping in 60 seconds
[23:48:23 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:48:24 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:48:25 2015/06/17 -0500] INFO No new metadata in this iteration
[23:48:25 2015/06/17 -0500] INFO Looping in 60 seconds
[23:49:25 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:49:26 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:49:28 2015/06/17 -0500] INFO No new metadata in this iteration
[23:49:28 2015/06/17 -0500] INFO Looping in 60 seconds
[23:50:28 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:50:29 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:50:30 2015/06/17 -0500] INFO No new metadata in this iteration
[23:50:30 2015/06/17 -0500] INFO Looping in 60 seconds
[23:51:30 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:51:31 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:51:33 2015/06/17 -0500] INFO No new metadata in this iteration
[23:51:33 2015/06/17 -0500] INFO Looping in 60 seconds
[23:52:33 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:52:34 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:52:37 2015/06/17 -0500] INFO No new metadata in this iteration
[23:52:37 2015/06/17 -0500] INFO Looping in 60 seconds
[23:53:37 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:53:39 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:53:41 2015/06/17 -0500] INFO No new metadata in this iteration
[23:53:41 2015/06/17 -0500] INFO Looping in 60 seconds
[23:54:41 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:54:42 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:54:43 2015/06/17 -0500] INFO No new metadata in this iteration
[23:54:43 2015/06/17 -0500] INFO Looping in 60 seconds
[23:55:43 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:55:44 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:55:46 2015/06/17 -0500] INFO No new metadata in this iteration
[23:55:46 2015/06/17 -0500] INFO Looping in 60 seconds
[23:56:46 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:56:46 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:56:48 2015/06/17 -0500] INFO No new metadata in this iteration
[23:56:48 2015/06/17 -0500] INFO Looping in 60 seconds
[23:57:48 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:57:49 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:57:50 2015/06/17 -0500] INFO No new metadata in this iteration
[23:57:50 2015/06/17 -0500] INFO Looping in 60 seconds
[23:58:50 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:58:51 2015/06/17 -0500] INFO Get Tags and Metadata from Docker Hub
[23:58:53 2015/06/17 -0500] INFO No new metadata in this iteration
[23:58:53 2015/06/17 -0500] INFO Looping in 60 seconds
[23:59:53 2015/06/17 -0500] INFO Get Repos from Docker Hub
[23:59:54 2015/06/17 -0500] INFO Get Tags and Metadata from
[23:59:55 2015/06/17 -0500] INFO No new metadata in this iteration
[23:59:55 2015/06/17 -0500] INFO Looping in 60 seconds
[00:00:55 2015/06/18 -0500] INFO Get Repos from Docker Hub
[00:00:56 2015/06/18 -0500] INFO Get Tags and Metadata from Docker Hub
[00:00:57 2015/06/18 -0500] INFO No new metadata in this iteration
[00:00:57 2015/06/18 -0500] INFO Looping in 60 seconds

yoshiotu commented 9 years ago

For this test case, all the action is supposed to happen only at the very beginning of the run. Collector processes all the images it finds from the specified repository (banyanops/nginx which has only 2 images). After that, it continues to poll in case any new images show up, but that won't happen since we rarely update the images or tags in this test repository. Can you please check for output files in the following directory? $HOME/.banyan/hostcollector/

If all is well, you should see the file "imagelist" containing the IDs of the images that were processed, and a sub-directory "banyanout" containing some image metadata that was collected from the registry and the data gathered from each image by the "pkgextract" and "listUsers" scripts.
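As a rough way to run this check, here is a minimal Python sketch that summarises what is present under the output directory. It is not part of Collector. The function name `check_collector_output` is hypothetical, and treating "imagelist" as one image ID per non-empty line is an assumption that this thread does not confirm.

```python
from pathlib import Path


def check_collector_output(base: Path) -> dict:
    """Summarise the files found under a Collector output directory.

    Assumption: "imagelist" holds one image ID per non-empty line.
    """
    imagelist = base / "imagelist"
    banyanout = base / "banyanout"
    report = {
        "imagelist_exists": imagelist.is_file(),
        "image_ids": [],
        "script_outputs": {},
    }
    if imagelist.is_file():
        # Keep only non-empty lines, stripped of surrounding whitespace.
        report["image_ids"] = [
            line.strip()
            for line in imagelist.read_text().splitlines()
            if line.strip()
        ]
    if banyanout.is_dir():
        # Each sub-directory is expected to hold the output of one script.
        for sub in sorted(p for p in banyanout.iterdir() if p.is_dir()):
            report["script_outputs"][sub.name] = sorted(
                f.name for f in sub.iterdir() if f.is_file()
            )
    return report
```

Calling `check_collector_output(Path.home() / ".banyan" / "hostcollector")` returns a dictionary. An empty `script_outputs` entry would suggest that a script produced no per-image files.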

Here's what happens for me (except for the authconfig which I blanked out with XXXXXXX -- sorry, Collector shouldn't log sensitive data, I will remove that from the code right away):

~/gospace/src/github.com/banyanops/collector$ COLLECTOR_DIR=$PWD collector index.docker.io banyanops/nginx
[22:20:03 PDT 2015/06/17] DEBG Creating directory: /home/yoshiotu/.banyan
[22:20:03 PDT 2015/06/17] DEBG Creating directory: /home/yoshiotu/.banyan/hostcollector
[22:20:03 PDT 2015/06/17] DEBG Creating directory: /home/yoshiotu/.banyan/hostcollector
[22:20:03 PDT 2015/06/17] DEBG Creating directory: /home/yoshiotu/.banyan/hostcollector/banyanout
[22:20:03 PDT 2015/06/17] DEBG Creating directory: /home/yoshiotu/.banyan/hosttarget/defaultscripts
[22:20:03 PDT 2015/06/17] DEBG Creating directory: /home/yoshiotu/.banyan/hosttarget/userscripts
[22:20:03 PDT 2015/06/17] DEBG Creating directory: /home/yoshiotu/.banyan/hosttarget/bin
[22:20:04 PDT 2015/06/17] INFO Repolist: /home/yoshiotu/.banyan/hostcollector/repolist not specified
[22:20:04 PDT 2015/06/17] INFO Limiting collection to the following repos:
[22:20:04 PDT 2015/06/17] INFO banyanops/nginx
[22:20:04 PDT 2015/06/17] INFO authconfig is XXXXXXX
[22:20:04 PDT 2015/06/17] INFO registry API URL: https://index.docker.io
[22:20:04 PDT 2015/06/17] WARN open /home/yoshiotu/.banyan/hostcollector/imagelist: no such file or directory : Error in opening /home/yoshiotu/.banyan/hostcollector/imagelist : perhaps a fresh start?
[22:20:04 PDT 2015/06/17] INFO Fresh start: No previously collected images were found in /home/yoshiotu/.banyan/hostcollector/imagelist
[22:20:04 PDT 2015/06/17] INFO Get Repos from Docker Hub
[22:20:05 PDT 2015/06/17] INFO Get Tags and Metadata from Docker Hub
[22:20:08 PDT 2015/06/17] INFO Get Metadata for Image: 02a791aafe156afd78978782d7b75af029dbabcd249c0adf02b0e3d231e66224
[22:20:08 PDT 2015/06/17] INFO Get Metadata for Image: 42a3cf88f3f0cce2b4bfb2ed714eec5ee937525b4c7e0a0f70daff18c3f2ee92
[22:20:08 PDT 2015/06/17] INFO Get Metadata for Image: 42a3cf88f3f0cce2b4bfb2ed714eec5ee937525b4c7e0a0f70daff18c3f2ee92
[22:20:09 PDT 2015/06/17] INFO Obtained 3 new metadata items in this iteration
[22:20:09 PDT 2015/06/17] INFO Appending image metadata to file...
[22:20:09 PDT 2015/06/17] INFO Starting to apply maxImages limit to repo banyanops/nginx
[22:20:09 PDT 2015/06/17] INFO PullImages downloading /images/create?fromImage=index.docker.io/banyanops/nginx:1.9, Image ID: 42a3cf88f3f0cce2b4bfb2ed714eec5ee937525b4c7e0a0f70daff18c3f2ee92
[22:20:13 PDT 2015/06/17] INFO PullImages downloading /images/create?fromImage=index.docker.io/banyanops/nginx:1.7, Image ID: 02a791aafe156afd78978782d7b75af029dbabcd249c0adf02b0e3d231e66224
[22:20:17 PDT 2015/06/17] INFO Executing command: docker [PATH=/banyancollector/bin:$PATH bash-static /banyancollector/defaultscripts/pkgextractscript.sh]
[22:20:17 PDT 2015/06/17] INFO Got ID f467089cd24deb3347162ac711d4bfadbc520c5cd35e5179d53bf6584cf6aa3b Warnings
[22:20:17 PDT 2015/06/17] INFO Got StatusCode 0
[22:20:17 PDT 2015/06/17] INFO Executing command: docker [PATH=/banyancollector/bin:$PATH python-static /banyancollector/userscripts/listUsers.py]
[22:20:17 PDT 2015/06/17] INFO Got ID f600f48a35c89c48551e87d64107f55d1da6b2b6772c778230fd78c98f4d0116 Warnings
[22:20:18 PDT 2015/06/17] INFO Got StatusCode 0
[22:20:18 PDT 2015/06/17] INFO Executing command: docker [PATH=/banyancollector/bin:$PATH bash-static /banyancollector/defaultscripts/pkgextractscript.sh]
[22:20:18 PDT 2015/06/17] INFO Got ID e3fe0f75fce37e4494985c42718a20a28beed11d797135d6c7138ef3978bf7ab Warnings
[22:20:18 PDT 2015/06/17] INFO Got StatusCode 0
[22:20:18 PDT 2015/06/17] INFO Executing command: docker [PATH=/banyancollector/bin:$PATH python-static /banyancollector/userscripts/listUsers.py]
[22:20:18 PDT 2015/06/17] INFO Got ID eb2618d1e098e9d24500d28499307a76ec23a184b4eb16bc1a2d66148263b157 Warnings
[22:20:19 PDT 2015/06/17] INFO Got StatusCode 0
[22:20:19 PDT 2015/06/17] INFO Writing image (pkg and other) data into file...
[22:20:19 PDT 2015/06/17] INFO Writing /home/yoshiotu/.banyan/hostcollector/banyanout/pkgextractscript/42a3cf88f3f0-pkgdata...
[22:20:19 PDT 2015/06/17] INFO Writing /home/yoshiotu/.banyan/hostcollector/banyanout/listUsers/42a3cf88f3f0-miscdata...
[22:20:19 PDT 2015/06/17] INFO Writing /home/yoshiotu/.banyan/hostcollector/banyanout/pkgextractscript/02a791aafe15-pkgdata...
[22:20:19 PDT 2015/06/17] INFO Writing /home/yoshiotu/.banyan/hostcollector/banyanout/listUsers/02a791aafe15-miscdata...
[22:20:19 PDT 2015/06/17] INFO Looping in 60 seconds
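The log above ends with "Looping in 60 seconds", which matches the behaviour described earlier: all the work happens in the first iteration, and later iterations find nothing new. Here is a minimal Python sketch of that kind of poll loop. It is not Collector's actual code (Collector is written in Go). The names `poll_for_new_images`, `list_image_ids` and `process_image` are hypothetical, and the sketch counts new images rather than the metadata items that Collector logs.

```python
import time
from typing import Callable, Iterable, List, Set


def poll_for_new_images(
    list_image_ids: Callable[[], Iterable[str]],
    process_image: Callable[[str], None],
    seen: Set[str],
    iterations: int,
    interval_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Poll a fixed number of times; return the count of new images per iteration."""
    new_counts = []
    for i in range(iterations):
        # De-duplicate while preserving order, then drop IDs already processed.
        candidates = dict.fromkeys(list_image_ids())
        new_ids = [image_id for image_id in candidates if image_id not in seen]
        for image_id in new_ids:
            process_image(image_id)
            seen.add(image_id)
        new_counts.append(len(new_ids))
        if i < iterations - 1:
            sleep(interval_seconds)  # corresponds to "Looping in 60 seconds"
    return new_counts
```

With a repository whose contents never change, the first iteration reports every image and each later one reports zero. That is the steady state shown in the repeating "No new metadata in this iteration" lines, and it indicates an idle poller rather than a hang.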

kudva commented 9 years ago

Hi, thanks for the quick response. I did look at the banyanout directory before, and it looks like this (after the collector run):

nodename: ~/.banyan/hostcollector/banyanout$ ls -lt
total 16
-rw-r--r-- 1 root root 7373 Jun 18 00:36 metadata.json
drwxr-xr-x 2 root root 4096 Jun 18 00:29 listUsers
drwxr-xr-x 2 root root 4096 Jun 18 00:29 pkgextractscript

I don't see any other output files. Am I missing the output data?

The metadata.json file is fairly small: only 234 lines, with about 20 entries of this form:

{
  "Image": "02a791aafe156afd78978782d7b75af029dbabcd249c0adf02b0e3d231e66224",
  "Datetime": "2015-04-22T05:47:34.38360396Z",
  "Repo": "banyanops/nginx",
  "Tag": "1.7",
  "Size": 0,
  "Author": "NGINX Docker Maintainers \"docker-maint@nginx.com\"",
  "Checksum": "",
  "Comment": "",
  "Parent": "9fc02b3a9859514195abc653cf61e7554e6e02d926ed18a7127c08bac734ddac"
}
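To see which images and tags those entries cover, a short Python sketch can group them by image ID. It assumes that metadata.json is a single JSON array of objects shaped like the entry above, which this thread does not confirm. The function name `tags_by_image` is hypothetical.

```python
import json
from collections import defaultdict


def tags_by_image(metadata_text: str) -> dict:
    """Group (repo, tag) pairs by image ID.

    Assumption: the text is a JSON array of objects that each have
    "Image", "Repo" and "Tag" keys.
    """
    entries = json.loads(metadata_text)
    grouped = defaultdict(set)
    for entry in entries:
        # A set collapses entries that repeat the same repo and tag.
        grouped[entry["Image"]].add((entry["Repo"], entry["Tag"]))
    return {image: sorted(tags) for image, tags in grouped.items()}
```

Reading the file and passing its text to `tags_by_image` gives one key per distinct image, so repeated entries for the same image and tag collapse into a single pair.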

yoshiotu commented 9 years ago

Looks good. Anything in the pkgextractscript and listUsers directories? Should be one file per image in each directory.

Thanks, Yoshio (sent from my phone)


kudva commented 9 years ago

Thanks again! Yes, I see one file per image in each directory. So it seems the data contains three parts:

1. A list of package distros and versions to be inspected.
2. A list of volumes to be inspected.
3. The metadata file with the checksums I mentioned.

But I don't see the results of the actual inspections. Where are they?

yoshiotu commented 9 years ago

Collector systematically gathers data from images, using the default scripts included in this repository plus any scripts you provide. The output can then be passed to separate tools or services, for things like visualization, deeper analysis, etc. Our proprietary analysis service is in private beta (issue #13).