Is your feature request related to a problem? Please describe.
We need a mechanism to generate a comprehensive description of the output directory structure:
files
columns and their meaning.
Some considerations:
we may want a concise version that can be shared with customers.
be clear about the version number that supports each output.
some fields that are added for internal engineers can be labeled differently.
we need to do this for both the python and scala modules.
the generated files can be markdown files that are added to the repo. The documentation can then either link to the markdown files or include them (it is easy to convert markdown into reStructuredText format).
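To make the considerations above concrete, here is a minimal Python sketch of how a per-file description could be rendered to markdown, with a concise variant that omits internal fields. Everything in it is an assumption made for illustration: the `ColumnSpec` and `FileSpec` classes, the `internal` flag, the `since_version` field, and the example file name and columns are hypothetical and do not exist in the repo today.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ColumnSpec:
    name: str
    description: str
    internal: bool = False  # hypothetical flag marking engineer-only fields


@dataclass
class FileSpec:
    path: str
    since_version: str  # hypothetical: first tools version that emits this file
    columns: List[ColumnSpec] = field(default_factory=list)


def render_markdown(spec: FileSpec, include_internal: bool = True) -> str:
    """Render the description of one output file as a markdown section."""
    lines = [
        f"## {spec.path}",
        "",
        f"Available since version {spec.since_version}.",
        "",
        "| Column | Description |",
        "| --- | --- |",
    ]
    for col in spec.columns:
        if col.internal and not include_internal:
            continue  # the concise, customer-facing version skips internal fields
        label = f"{col.name} (internal)" if col.internal else col.name
        lines.append(f"| {label} | {col.description} |")
    return "\n".join(lines) + "\n"


# Hypothetical file and columns, for illustration only.
example_spec = FileSpec(
    path="example_output.csv",
    since_version="0.0.0",
    columns=[
        ColumnSpec("App ID", "Identifier of the Spark application"),
        ColumnSpec("Debug Notes", "Diagnostic detail for engineers", internal=True),
    ],
)

full_doc = render_markdown(example_spec)
concise_doc = render_markdown(example_spec, include_internal=False)
```

Keeping the internal/customer distinction as a flag on the column means both variants come from a single source of truth, so they cannot drift apart.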
Additional context
On Scala, we can consider a mechanism similar to what the RAPIDS plugin does to generate supported-types. For example, we have a class ProfilerResult that can be used to crawl all the output types to generate the docs.
On Python, we can consider using schema validators/generators. We already use some of this functionality in the user_tools module.
The AutoGenerator engine can also be used in testing. For example, if we have a defined schema for each output file, we can have a validator check all of those files in our E2E tests. CC: @parthosa
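As a rough illustration of the testing idea, the sketch below checks a CSV header against a declared schema using only the Python standard library. The `EXPECTED_COLUMNS` mapping, the `validate_header` helper, and the file and column names are hypothetical placeholders, not the actual tool outputs or the validators already used in user_tools.

```python
import csv
import io
from typing import Dict, List

# Hypothetical schema: output file name -> documented column names.
EXPECTED_COLUMNS: Dict[str, List[str]] = {
    "example_output.csv": ["App ID", "App Name", "Duration"],
}


def validate_header(file_name: str, csv_text: str) -> List[str]:
    """Return a list of problems found in the CSV header (empty if none)."""
    expected = EXPECTED_COLUMNS.get(file_name)
    if expected is None:
        return [f"{file_name}: no schema defined"]
    # Read only the first row, which is assumed to be the header.
    header = next(csv.reader(io.StringIO(csv_text)), [])
    problems = []
    missing = [c for c in expected if c not in header]
    extra = [c for c in header if c not in expected]
    if missing:
        problems.append(f"{file_name}: missing columns {missing}")
    if extra:
        problems.append(f"{file_name}: undocumented columns {extra}")
    return problems


good = validate_header("example_output.csv", "App ID,App Name,Duration\napp-1,demo,42\n")
bad = validate_header("example_output.csv", "App ID,Duration,Extra\napp-1,42,x\n")
```

An E2E test could call such a helper on every generated file and fail when the returned list is non-empty, which would flag columns that were added or dropped without a matching documentation update.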