openmainframeproject / software-discovery-tool

Software Discovery Tool

Documentation improvement: How to use the Back-end Data Sources #140

Closed: pleia2 closed this 3 months ago

pleia2 commented 1 year ago

It's somewhat unclear to contributors and users alike how the back-end data sources for the Software Discovery Tool are to be used and/or generated. We should prepare some concise documentation to cover this, something like the following, which I shared in Slack last week:

People who use the core code of the tool basically have three options when it comes to the back-end sources, and we should keep in mind that we want to continue supporting all of them.

  1. Pull in the data sources that we, as a project, have prepared in https://github.com/openmainframeproject/software-discovery-tool-data (these should be refreshed monthly for everything that can be generated programmatically, but we need to do a better job of that). This is done with the submodule commands described in the installation documentation; a sketch follows this list.
  2. Run bin/package_build.py themselves. This script goes out to the canonical resource for each distribution to pull in the latest data, and the user is responsible for keeping this updated on their installation. The only exceptions are Ubuntu, RHEL, and SLES, whose data it pulls from https://github.com/linux-on-ibm-z/PDS/tree/master/distro_data (see the invocation sketch after this list).
  3. Ignore all of our data sources and scripts entirely and load up their own .json files (this may be the case if other architectures want to use the code, or if it's being used internally at an organization to search for software); an illustrative file is sketched after this list.
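
For option 1, the submodule workflow from the installation documentation boils down to something like this (a minimal sketch, assuming the data repository is already declared as a submodule of this one):

```
# Fetch the project-prepared data sources as a git submodule
git submodule init
git submodule update
```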
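
For option 2, the invocation is presumably just running the script from the repository root; any flags or configuration that package_build.py might take aren't covered here, so treat this as an assumption:

```
# Pull the latest package data from each distribution's canonical
# resource (Ubuntu, RHEL, and SLES come from linux-on-ibm-z/PDS instead)
python3 bin/package_build.py
```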
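
For option 3, a custom data file is just a .json file supplying the same kind of package listing; the structure below is purely hypothetical and not the tool's actual schema, so check the files in the data repository for the real format:

```
{
  "packages": [
    {"name": "bash", "version": "5.1"},
    {"name": "python3", "version": "3.9.2"}
  ]
}
```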
pleia2 commented 1 year ago

With #97 in the mix, we can expand on this a bit: whatever mechanism you use to generate your series of YAML files, once the files are in the distro_data/data_files/ directory, they are used to generate the MySQL tables with the bin/database_build.py script.
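
Concretely, that last step might be as simple as the following (a sketch assuming database_build.py needs no extra flags and the MySQL connection is already configured):

```
# With the generated YAML files sitting in distro_data/data_files/,
# build the MySQL tables from them
python3 bin/database_build.py
```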

aarishshahmohsin commented 3 months ago

Can I be assigned this issue?

rachejazz commented 3 months ago

Please send a PR :D @aarishshahmohsin Also note that a few of these are already addressed in an outstanding PR.