repology / repology-updater

Repology backend service to update repository and package data
https://repology.org
GNU General Public License v3.0

CVE support improvements #1043

Open AMDmi3 opened 4 years ago

AMDmi3 commented 4 years ago

Regarding editions, we need to support more fields:

The common format is cpe:2.3:<part>:<vendor>:<product>:<version>:<update>:<edition>:<lang>:<sw_edition>:<target_sw>:<target_hw>[:other]

We may filter on part, which is expected to be 'a' for applications (as opposed to operating systems and hardware). edition is deprecated, and we additionally need to support at least sw_edition and target_sw.

Docs: https://cpe.mitre.org/specification/#downloads
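As a rough illustration of the format above, here is a minimal Python sketch that splits a CPE 2.3 formatted string into its named fields and filters on part. It is not Repology's actual parser: `CPE`, `parse_cpe23`, `is_application` and the `example_vendor` values are hypothetical names used only for this example, and padding missing trailing fields with `*` is an assumption.

```python
import re
from typing import NamedTuple


class CPE(NamedTuple):
    """Fields of a CPE 2.3 formatted string, in the order listed above."""
    part: str
    vendor: str
    product: str
    version: str
    update: str
    edition: str
    lang: str
    sw_edition: str
    target_sw: str
    target_hw: str
    other: str


def parse_cpe23(text: str) -> CPE:
    # Split on colons that are not preceded by a backslash, so escaped
    # colons such as the ones in file\:\:path stay inside one field.
    # Simplification: an escaped backslash right before a colon is not handled.
    fields = re.split(r'(?<!\\):', text)
    if fields[:2] != ['cpe', '2.3']:
        raise ValueError(f'not a CPE 2.3 formatted string: {text!r}')
    values = fields[2:]
    # Assumption: missing trailing fields are treated as the wildcard '*'.
    values += ['*'] * (11 - len(values))
    if len(values) != 11:
        raise ValueError(f'too many fields in {text!r}')
    return CPE(*values)


def is_application(cpe: CPE) -> bool:
    # part is 'a' for applications, 'o' for operating systems, 'h' for hardware.
    return cpe.part == 'a'
```

For example, `parse_cpe23(r'cpe:2.3:a:example_vendor:file\:\:path:1.0:*:*:*:*:perl:*:*').target_sw` yields `'perl'`, which is the kind of field a product-only match would ignore.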

AMDmi3 commented 4 years ago

The plan for introduction of new cpe fields

AMDmi3 commented 4 years ago

List of CPEs which need to be added when all CPE fields are supported

AMDmi3 commented 4 years ago

List of CVEs affecting only an OS-specific release of some software. May be ignored for now; in the future this may require setting cpe_target_sw via repository config or splitting the software into OS-specific items.

Apteryks commented 4 years ago

Hello!

Could it be possible to fall back to using the package name when no cpe-name property is found in the package definition? We use this in GNU Guix, and it seems to work well.

Apteryks commented 4 years ago

civodul from #guix just mentioned that repology parses the data at 'guix.gnu.org/packages.json', which means a simple solution that doesn't require any change on your side would be to add the cpe metadata to every package in that list on our side. I'll try just this later.

AMDmi3 commented 4 years ago

I've asked about this on #guix a month ago.

17:49 -!- Irssi: Join to #guix was synced in 1 secs
17:53 < AMDmi3> Hi! I've a question about CPE support in Guix
17:53 < AMDmi3> I see some packages define `cpe-name`; is it really used for something?
17:54 < rekado> AMDmi3: yes, by “guix lint”
17:55 < AMDmi3> sorry, I'm in fact not guix user; what does it do?
17:58 < AMDmi3> aha, I've found docs, I see now
18:00 < AMDmi3> well then it would make sense to tell that I've added CPE support to repology some time ago, and it's now capable of reporting missing CPE information
18:00 < AMDmi3> see https://repology.org/repository/gnuguix/problems
18:02 < AMDmi3> also note that in order to have complete CVE search, you'd need a way to define multiple CPEs per package, and also be able to specify any CPE field, including, most importantly, vendor
18:02 < roptat_> that's not really useful, because guix lint will use the package name if no cpe-name is provided
18:02 < roptat_> at least, you have too many false positive
18:04 < roptat_> for instance, the first item in the list is "acl" and you suggest using "acl" as the product name, which is great, but guix lint already does that
18:04 < AMDmi3> uh huh, I can take this into account
18:05 < roptat_> that would make the page more interesting :)
18:07 < AMDmi3> although no: it would produce a bunch of false positives of another kind, CPEs without matches in NVD or CPE dictionary
18:08 < roptat_> just ignore those?
18:08 < roptat_> currently guix can only associate one cpe name to a package, it's either cpe-name if it is present, or the package's name
18:09 < roptat_> if you detect a cpe-name that does not correspond to any actual name, that's not really an issue
18:09 -!- raingloom [~raingloom@BC9CFD9A.catv.pool.telekom.hu] has quit [Ping timeout: 246 seconds]
18:09 < roptat_> an issue would be if a package has the wrong cpe name (implicit or explicit)
18:09 < AMDmi3> this way there's no telling whether assumed CPE is incorrect or there are just no CVEs for the product
18:15 < AMDmi3> well I've just wanted to inform that there's now a tool
18:16 -!- rgherdt [~rgherdt@2a02:8109:86c0:d8d:8dae:2a9e:6881:93a1] has quit [Ping timeout: 272 seconds]
18:16 < AMDmi3> if you want CPE matching to be reliable you'd probably need to always specify them explicitly, allow multiple of them and support all fields
--- Log closed Thu Jun 11 18:17:16 2020

As long as guix does not define CPE information explicitly, nothing useful can be done on the Repology side. If we fall back to the package name as cpe_product, "CPE information is missing" problems will turn into "CPE unreferenced" problems for each package for which there are no known CVEs or CPE dictionary entries, because there's no telling whether the fallback CPE is incorrect or no CPE is needed for that package at all.

Note that this only affects reported Problems, and does not affect Repology's ability to report vulnerable package versions, as we use our own set of CPE bindings.

I'd say guix should instead improve CPE support by defining CPE information explicitly (ideally also use all cpe fields and maybe allow multiple CPE tuples), in which case Repology would be able to help fill/fix missing or incorrect CPEs.

Apteryks commented 4 years ago

> I've asked about this on #guix a month ago. [...] If we fall back to the package name as cpe_product, "CPE information is missing" problems will turn into "CPE unreferenced" problems for each package for which there are no known CVEs or CPE dictionary entries, because there's no telling whether the fallback CPE is incorrect or no CPE is needed for that package at all.

From my point of view, the "CPE unreferenced" problem would be more accurate and useful than "CPE information is missing", at least in the context of Guix (because of the implicit package name == cpe_name unless otherwise explicitly defined).

> I'd say guix should instead improve CPE support by defining CPE information explicitly (ideally also use all cpe fields and maybe allow multiple CPE tuples), in which case Repology would be able to help fill/fix missing or incorrect CPEs.

Honest question: why are multiple CPEs useful? I thought the raison d'être of a CPE was to be unique.

Thank you for Repology; it's a very useful tool.

AMDmi3 commented 4 years ago

> From my point of view, the "CPE unreferenced" problem would be more accurate and useful than "CPE information is missing", at least in the context of Guix (because of the implicit package name == cpe_name unless otherwise explicitly defined).

It won't be - as already said, you won't be able to tell packages which don't need a CPE from packages which need cpe_name defined and from packages which have an incorrect cpe_name - there would be a problem for each of these. With the current setting, "CPE missing" can at least be filtered to ignore packages with name == cpe_product, and "CPE unreferenced" reliably indicates incorrect data.
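The argument above can be sketched in a few lines of Python. This is only a toy model of the two problem kinds as described in this thread, not Repology's implementation: `report_problem`, its parameters and the sample package names are hypothetical, and "the project has known CPE products" stands in for whatever Repology's own CPE bindings provide.

```python
from typing import Optional, Set


def report_problem(
    name: str,
    cpe_product: Optional[str],
    project_products: Set[str],
    dictionary: Set[str],
    fallback_to_name: bool = False,
) -> Optional[str]:
    """Return the problem that would be reported for one package, if any.

    project_products: CPE products known (from CVEs) for the package's project.
    dictionary: all CPE products seen in CVEs or the CPE dictionary.
    """
    if cpe_product is None and fallback_to_name:
        # Hypothetical fallback: pretend the package name is the CPE product.
        cpe_product = name
    if cpe_product is None:
        # Only worth reporting when there is something to suggest.
        return 'CPE missing' if project_products else None
    if cpe_product not in dictionary:
        return 'CPE unreferenced'
    return None


known = {'bash'}
# A package that needs no CPE at all, and one that needs cpe_product=bash.
for pkg, products in [('hello', set()), ('bash-minimal', {'bash'})]:
    print(pkg,
          report_problem(pkg, None, products, known),
          report_problem(pkg, None, products, known, fallback_to_name=True))
```

Without the fallback, only `bash-minimal` gets a "CPE missing" report and `hello` stays clean. With the fallback, both end up as "CPE unreferenced", so the report no longer separates packages that need fixing from packages that need nothing.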

> I thought the raison d'être of a CPE was to be unique.

Yes, but in the real world it's far from true - see git for example. Sometimes different CPEs are historical, sometimes they are just used interchangeably. Sometimes CPE information in NVD is fixed, so they aren't even stable. Sometimes there are objective reasons, such as that curl and libcurl have separate CPEs but are distributed together.

AMDmi3 commented 4 years ago

I can query potential problems for you though. Query for reference (packages in guix which have no cpe_name defined, which have related vulnerabilities with cpe_product different to package name, and no vulnerabilities with cpe_product matching package name):

SELECT DISTINCT
    name AS guix_name,
    vulnerable_projects.cpe_product AS cpe_name
FROM packages INNER JOIN vulnerable_projects USING (effname)
WHERE
    packages.repo='gnuguix' AND
    packages.cpe_product IS NULL AND
    packages.name != vulnerable_projects.cpe_product AND
    NOT EXISTS (SELECT * FROM vulnerable_projects vulnerable_projects1 WHERE vulnerable_projects1.cpe_product = packages.name)
ORDER BY guix_name

Results in:

 guix_name                     | cpe_name
-------------------------------+--------------------------------------
 ant-apache-bcel               | ant
 ant-junit                     | ant
 bash-minimal                  | bash
 bash-static                   | bash
 boinc-client                  | boinc
 boost-signals2                | boost
 cairo-xcb                     | cairo
 chocolate-doom                | chocolate_doom
 crawl                         | dungeon_crawl_stone_soup
 crawl-tiles                   | dungeon_crawl_stone_soup
 crispy-doom                   | crispy_doom
 crypto++                      | crypto\+\+
 cups-minimal                  | cups
 dropbear                      | dropbear_ssh
 e2fsck-static                 | e2fsprogs
 ecryptfs-utils                | ecryptfs_utils
 exfat-utils                   | exfat
 ffmpeg-jami                   | ffmpeg
 frrouting                     | free_range_routing
 gdk-pixbuf+svg                | gdk-pixbuf
 ghostscript-with-cups         | ghostscript
 ghostscript-with-cups         | gpl_ghostscript
 ghostscript-with-x            | ghostscript
 ghostscript-with-x            | gpl_ghostscript
 git-minimal                   | git
 glibc-locales                 | glibc
 groff-minimal                 | groff
 gtk+                          | gtk\+
 guile-readline                | guile
 guile-static-stripped         | guile
 guile-static-stripped-tarball | guile
 guile2.2-gnutls               | gnutls
 guile2.2-readline             | guile
 hdf5-parallel-openmpi         | hdf5
 hplip                         | linux_imaging_and_printing_project
 hplip-minimal                 | linux_imaging_and_printing_project
 httpd                         | http_server
 icu4c                         | international_components_for_unicode
 jack                          | jack2
 java-guava                    | guava
 java-log4j-1.2-api            | log4j
 java-log4j-api                | log4j
 java-log4j-core               | log4j
 java-xerces                   | xerces-j
 knot                          | knot_dns
 knot-resolver                 | knot_resolver
 ldb                           | samba
 libbson                       | c_driver
 libgc                         | garbage_collector
 libgc-back-pointers           | garbage_collector
 libltdl                       | libtool
 libpng-apng                   | libpng
 libtorrent-rasterbar          | libtorrent
 libungif                      | giflib
 mariadb-connector-c           | connector\/c
 mbedtls-apache                | mbed_tls
 menu-cache                    | libmenu-cache
 mesa-utils                    | mesa
 mtools                        | mformat
 ncurses-with-gpm              | ncurses
 network-manager               | networkmanager
 node                          | node.js
 node                          | nodejs
 nspr                          | netscape_portable_runtime
 perl-file-path                | file\:\:path
 perl-libwww                   | libwww-perl
 perl-xml-libxml               | xml-libxml
 poppler-qt4                   | poppler
 poppler-qt5                   | poppler
 python-flask                  | flask
 python-gdal                   | gdal
 python-httplib2               | httplib2
 python-ipython                | ipython
 python-keyring                | keyring
 python-libxml2                | libxml2
 python-lxml                   | lxml
 python-openpyxl               | openpyxl
 python-pandas                 | pandas
 python-pillow                 | pillow
 python-pip                    | pip
 python-py-bcrypt              | py-bcrypt
 python-pycrypto               | pycrypto
 python-pycryptodome           | pycryptodome
 python-pyjwt                  | pyjwt
 python-pyopenssl              | pyopenssl
 python-pyxdg                  | pyxdg
 python-pyyaml                 | pyyaml
 python-requests               | requests
 python-rply                   | rply
 python-scikit-learn           | scikit-learn
 python-setuptools             | setuptools
 python-typed-ast              | typed_ast
 python-urllib3                | urllib3
 python-virtualenv             | virtualenv
 python2-flask                 | flask
 python2-gnupg                 | python-gnupg
 python2-httplib2              | httplib2
 python2-ipython               | ipython
 python2-keyring               | keyring
 python2-libxml2               | libxml2
 python2-lxml                  | lxml
 python2-pandas                | pandas
 python2-pillow                | pillow
 python2-pip                   | pip
 python2-py-bcrypt             | py-bcrypt
 python2-pycrypto              | pycrypto
 python2-pycryptodome          | pycryptodome
 python2-pyjwt                 | pyjwt
 python2-pyopenssl             | pyopenssl
 python2-pyxdg                 | pyxdg
 python2-pyyaml                | pyyaml
 python2-requests              | requests
 python2-rply                  | rply
 python2-rsa                   | python-rsa
 python2-scikit-learn          | scikit-learn
 python2-setuptools            | setuptools
 python2-urllib3               | urllib3
 python2-virtualenv            | virtualenv
 qemu-minimal                  | qemu
 qgpgme                        | gpgme
 qtbase                        | qt
 qtcharts                      | qt
 qtconnectivity                | qt
 qtdatavis3d                   | qt
 qtdeclarative                 | qt
 qtgamepad                     | qt
 qtgraphicaleffects            | qt
 qtimageformats                | qt
 qtlocation                    | qt
 qtmultimedia                  | qt
 qtnetworkauth                 | qt
 qtpurchasing                  | qt
 qtquickcontrols               | qt
 qtquickcontrols2              | qt
 qtremoteobjects               | qt
 qtscript                      | qt
 qtscxml                       | qt
 qtsensors                     | qt
 qtserialbus                   | qt
 qtserialport                  | qt
 qtspeech                      | qt
 qtsvg                         | qt
 qttools                       | qt
 qtwayland                     | qt
 qtwebchannel                  | qt
 qtwebengine                   | qt
 qtwebglplugin                 | qt
 qtwebsockets                  | qt
 qtwebview                     | qt
 qtx11extras                   | qt
 qtxmlpatterns                 | qt
 ruby-nokogiri                 | nokogiri
 ruby-puma                     | puma
 ruby-rack                     | rack
 ruby-rails                    | rails
 ruby-rake                     | rake
 ruby-rubyzip                  | rubyzip
 ruby-sanitize                 | sanitize
 ruby-websocket-extensions     | websocket-extensions
 sane-backends-minimal         | sane-backends
 sdl-image                     | sdl_image
 sdl2                          | libsdl
 sdl2                          | sdl
 sdl2                          | simple_directmedia_layer
 sdl2-image                    | sdl2_image
 sox                           | sound_exchange
 tidy-html                     | tidy
 tigervnc-client               | tigervnc
 tigervnc-server               | tigervnc
 timidity++                    | timidity\+\+
 tintin++                      | tintin\+\+
 u-boot-tools                  | u-boot
 vim-full                      | vim
 vips                          | libvips
 wesnoth-server                | battle_for_wesnoth
 wesnoth-server                | wesnoth
 wine-minimal                  | wine
 wine-staging                  | wine
 wine64                        | wine
 wireless-tools                | wireless_tools
 wxwidgets-gtk2                | wxwidgets
 xapian                        | xapian-core
 xorg-server-xwayland          | xorg-server
 zabbix-agentd                 | zabbix
 zabbix-server                 | zabbix
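
For anyone who wants to experiment with the query above without access to the Repology database, here is a small self-contained sketch that runs it against an in-memory SQLite database. The two table layouts are reduced to the columns the query touches, and all rows (including the effname values) are made-up toy data, so this only illustrates the filtering logic, not the real schema.

```python
import sqlite3

# The query from the comment above, unchanged.
QUERY = """
SELECT DISTINCT
    name AS guix_name,
    vulnerable_projects.cpe_product AS cpe_name
FROM packages INNER JOIN vulnerable_projects USING (effname)
WHERE
    packages.repo='gnuguix' AND
    packages.cpe_product IS NULL AND
    packages.name != vulnerable_projects.cpe_product AND
    NOT EXISTS (SELECT * FROM vulnerable_projects vulnerable_projects1 WHERE vulnerable_projects1.cpe_product = packages.name)
ORDER BY guix_name
"""


def run_demo():
    conn = sqlite3.connect(':memory:')
    # Assumed minimal schema: only the columns used by the query.
    conn.executescript("""
        CREATE TABLE packages (repo TEXT, name TEXT, effname TEXT, cpe_product TEXT);
        CREATE TABLE vulnerable_projects (effname TEXT, cpe_product TEXT);
    """)
    conn.executemany('INSERT INTO packages VALUES (?, ?, ?, ?)', [
        ('gnuguix', 'bash-minimal', 'bash', None),        # name differs from CPE product
        ('gnuguix', 'curl', 'curl', None),                # name matches a CPE product
        ('gnuguix', 'node', 'nodejs', None),              # two candidate CPE products
        ('gnuguix', 'httpd', 'apache', 'http_server'),    # cpe_product already defined
        ('otherrepo', 'vim-full', 'vim', None),           # not a Guix package
    ])
    conn.executemany('INSERT INTO vulnerable_projects VALUES (?, ?)', [
        ('bash', 'bash'),
        ('curl', 'curl'),
        ('curl', 'libcurl'),
        ('nodejs', 'node.js'),
        ('nodejs', 'nodejs'),
        ('apache', 'http_server'),
        ('vim', 'vim'),
    ])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == '__main__':
    for guix_name, cpe_name in run_demo():
        print(f'{guix_name:<15}| {cpe_name}')
```

On this toy data only `bash-minimal` and `node` are reported. `curl` is skipped because a vulnerability with cpe_product equal to the package name exists, which is what the NOT EXISTS clause checks.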

Apteryks commented 4 years ago

@AMDmi3 hello, and thank you for this interesting query!

I'll look at how the available CPE name databases are organized and see if we can improve our linter to do something similar to what you've done above. Then, if we can be reasonably confident that packages lacking an explicit CPE name property have a package name matching an actual CPE product, I think a good first improvement to your site's results would be for us to add the implicit CPE name attributes to the packages.json list consumed by your service.

AMDmi3 commented 4 years ago

There's not much to look at - they just use all CPE fields. I've tried to get away with just product and vendor, but it didn't work out. Using only product is just unthinkable. The most obvious examples are a library plus its corresponding Rust, Python, or Node.js bindings, which differ only in the target_sw field. There are examples in this issue.
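
To make the target_sw point concrete, here is a minimal Python sketch. It assumes a deliberately simple matcher that compares the selected fields for exact equality and treats a missing field as `*`; that is an assumption for illustration, not a description of how Repology or NVD match CPEs. `matches`, `example_vendor` and `examplelib` are hypothetical names.

```python
from typing import Dict, Tuple

ANY = '*'


def matches(binding: Dict[str, str], cve_cpe: Dict[str, str],
            fields: Tuple[str, ...]) -> bool:
    # Compare only the listed fields; an absent field counts as the wildcard.
    return all(binding.get(f, ANY) == cve_cpe.get(f, ANY) for f in fields)


# A C library and its Node.js binding share vendor and product names.
c_library = {'vendor': 'example_vendor', 'product': 'examplelib'}
node_binding = {'vendor': 'example_vendor', 'product': 'examplelib',
                'target_sw': 'node.js'}

# A CVE that affects only the Node.js binding.
cve = {'vendor': 'example_vendor', 'product': 'examplelib',
       'target_sw': 'node.js'}

print(matches(c_library, cve, ('product',)))                        # True
print(matches(c_library, cve, ('vendor', 'product')))               # True
print(matches(c_library, cve, ('vendor', 'product', 'target_sw')))  # False
print(matches(node_binding, cve, ('vendor', 'product', 'target_sw')))  # True
```

With product alone, or product plus vendor, the C library is wrongly flagged by a CVE for the binding. Only once target_sw takes part in the comparison do the two packages separate.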