Curts0 / PyTabular

Connect to Tabular Models via Python
https://curts0.github.io/PyTabular/
MIT License

Documentation Generation #61

Open Daandamhuis opened 1 year ago

Daandamhuis commented 1 year ago

I've added a ModelDocumenter class which enables generating markdown files based on the Docusaurus notation for rendering the pages. It's not neat yet, but I will address that in the coming week. Is it easier to put the documentation in a submodule/subfolder?

@Curts0 What is easier, keep everything in the root folder of pytabular or create a documentation subfolder with all specific doc generation functions and classes?


import pytabular
import logging

logger = logging.getLogger("PyTabular")
model = pytabular.Tabular(f"{SERVER};Catalog={INITIAL_CATALOG}")

docs = pytabular.ModelDocumenter(model)
docs.save_documentation()

To-do:

Curts0 commented 1 year ago

Hey Daan,

Sorry just getting to this. I think a subfolder/submodule would be good. That would give users the flexibility to import documentation items separately when they want to. Open to whatever feels best though.

Daandamhuis commented 1 year ago

@Curts0 : I’m struggling a bit with the “Make Page Setup Dynamic”. I want to make it as easy as possible to alter the fields shown in the documentation. Like hide “Available in MDX”.

Would it be an idea to set up a template YAML file that can be used to override the existing fields? Or another sort of config that could be supplied to the generation process?

I’m currently working on implementing this in our DevOps Pipelines, so that when we deploy the docs site, a simple script updates the docs automatically. Would it be an idea to write something about it? Since I'm working on my website again, I could post it there, along with other write-ups.

Lastly I will create an example section to display the Adventure Works Documentation.

Daandamhuis commented 1 year ago

And one other thing. Do you have an example MkDocs file that I can try to generate?

Curts0 commented 1 year ago

I think running your ModelDocumenter in DevOps is the perfect use case. I'd honestly like to get that implemented for my own models too. Running the pipeline automatically after a deployment would be great. There is so much metadata hidden from the end user when they are in PBI, so having that documentation will help a lot.

I do think a yaml file is the right way to go, as long as it isn't mandatory, so it can work with or without it. Ideally all the user has to do is write a few lines of Python to see default documentation (which you already have with the two simple calls generate_documentation_pages() and save_documentation()). But then, if desired, the user could easily read some articles to really customize what they want with a yaml file or some arguments in the code. The more documentation/articles the better.
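
A minimal sketch of the "works with or without a config" idea, not ModelDocumenter's actual behaviour. The field names in DEFAULT_FIELDS and the resolve_fields helper are hypothetical, and the overrides dict stands in for whatever a YAML loader would return from the optional file:

```python
from typing import Dict, Optional

# Hypothetical defaults: which properties appear on a generated page.
DEFAULT_FIELDS: Dict[str, bool] = {
    "Description": True,
    "Display Folder": True,
    "Format String": True,
    "Available in MDX": True,
}


def resolve_fields(overrides: Optional[Dict[str, bool]] = None) -> Dict[str, bool]:
    """Return the default field visibility, updated with any user overrides."""
    fields = dict(DEFAULT_FIELDS)  # copy so the defaults are never mutated
    if overrides:
        unknown = set(overrides) - set(fields)
        if unknown:
            # Fail loudly on typos instead of silently ignoring them.
            raise KeyError(f"Unknown field(s): {sorted(unknown)}")
        fields.update(overrides)
    return fields


# Without a config file the defaults apply unchanged.
default_fields = resolve_fields()

# With a config (here: the dict a YAML file containing
# "Available in MDX: false" would parse to) only that field changes.
custom_fields = resolve_fields({"Available in MDX": False})
visible = [name for name, shown in custom_fields.items() if shown]
```

Because the overrides argument defaults to None, the two-line default workflow keeps working, and the config only matters for users who opt in.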

PyTabular has a GitHub action that runs mkgendocs to create the documentation from our docstrings. Then mkdocs to deploy it.

To see it locally you should be able to run:

gendocs --config mkgendocs.yml
mkdocs serve

This will load a local site of the documentation. I'm not completely satisfied with this approach and might change it later, but it works for now.

Those both have their own yaml file that I work with. The mkgendocs.yml tells what should be documented. Then the mkdocs.yml builds out the site with the pages.

Daandamhuis commented 1 year ago

@Curts0 : I've got a pipeline running at work.

Step 1

  1. Create a Job with windows-latest (afaik it won't run on ubuntu)
  2. Configure Python and Install PyTabular.
  3. Then Execute a local script that accepts command line arguments which control the model that will be deployed.
  4. Publish generated content to a Pipeline Artifact.

Step 2

  1. Create a second job with ubuntu-latest (Requirement for the Build and deploy an Azure Static Web App)
  2. Download Pipeline Artifact
  3. Generate HTML from the markdown files and Publish the build output to the static web app.

Local Script

import pytabular
import getopt
import sys
import logging

logger = logging.getLogger("PyTabular")

argument_list = sys.argv[1:]

logger.info(sys.argv)

# Options
options = "s:m:u:p:w:c:d:t:"

# Long options
long_options = [
    "server=",
    "model=",
    "user=",
    "password=",
    "workspace=",
    "catalog=",
    "docs=",
    "translations=",
]

# Parameters to define for connections
SERVER_NAME, WORKSPACE_NAME, USER_ID, PASSWORD, MODEL_NAME, CONN_STR = (
    None,
    None,
    None,
    None,
    None,
    str(),
)

# Parameters to define for Docs.
DOC_LOCATION = "docs"
SELECTED_CULTURE = "en-US"
USE_TRANSLATIONS = False

try:
    # Parsing argument
    arguments, values = getopt.getopt(argument_list, options, long_options)

    # checking each argument
    for current_argument, current_value in arguments:
        if current_argument in ("-s", "--server"):
            SERVER_NAME = current_value
        if current_argument in ("-m", "--model", "-c", "--catalog"):
            MODEL_NAME = current_value
        if current_argument in ("-w", "--workspace"):
            WORKSPACE_NAME = current_value
        if current_argument in ("-u", "--user"):
            USER_ID = current_value
        if current_argument in ("-p", "--password"):
            PASSWORD = current_value
        if current_argument in ("-d", "--docs"):
            DOC_LOCATION = current_value
        if current_argument in ("-t", "--translations"):
            USE_TRANSLATIONS = current_value == "Yes"

    if SERVER_NAME is not None and MODEL_NAME is not None:
        CONN_STR = f"Provider=MSOLAP;Data Source={SERVER_NAME}"

        if WORKSPACE_NAME is not None:
            CONN_STR = f"{CONN_STR}/{WORKSPACE_NAME}"
        if USER_ID is not None:
            CONN_STR = f"{CONN_STR};User ID={USER_ID}"
        if PASSWORD is not None:
            CONN_STR = f"{CONN_STR};Password={PASSWORD}"

        CONN_STR = f"{CONN_STR};Catalog={MODEL_NAME}"

    else:
        logger.warning("Arguments -m (--model) and -s (--server) are needed")
        logger.warning(f"Server Name: {SERVER_NAME} > Model Name: {MODEL_NAME}")
        CONN_STR = None

except getopt.error as err:
    # Log the parsing error and clear CONN_STR so no connection is attempted
    logger.warning(err)
    CONN_STR = None

if CONN_STR is not None:
    # Connect to a Tabular Model
    model = pytabular.Tabular(CONN_STR)

    # Initiate the Docs
    docs = pytabular.ModelDocumenter(
        model=model, save_location=f"{DOC_LOCATION}\\data-models"
    )

    # Set the translation for documentation to an available culture.
    # When translations are enabled, it checks whether the culture exists and,
    # if it does, starts using the translations for the docs
    if USE_TRANSLATIONS:
        docs.set_translations(enable_translations=True, culture=SELECTED_CULTURE)

    # Generate the pages.
    docs.generate_documentation_pages()

    # Save the docs to the configured save location
    docs.save_documentation()
else:
    logger.warning(f"Connection String isn't correctly setup >> {CONN_STR}")
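
The argument handling above could also be sketched with argparse from the standard library, which takes care of required options and usage errors. This is only an illustration under assumptions: build_conn_str and parse_args are hypothetical helper names, and the connection string simply mirrors the format assembled in the script above.

```python
import argparse
from typing import List, Optional


def build_conn_str(
    server: str,
    model: str,
    workspace: Optional[str] = None,
    user: Optional[str] = None,
    password: Optional[str] = None,
) -> str:
    """Assemble the connection string in the same order as the script above."""
    source = server if workspace is None else f"{server}/{workspace}"
    conn = f"Provider=MSOLAP;Data Source={source}"
    if user is not None:
        conn += f";User ID={user}"
    if password is not None:
        conn += f";Password={password}"
    return f"{conn};Catalog={model}"


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the same short and long options the getopt version accepts."""
    parser = argparse.ArgumentParser(description="Generate model documentation")
    parser.add_argument("-s", "--server", required=True)
    # -m/--model and -c/--catalog are aliases for the same value.
    parser.add_argument("-m", "--model", "-c", "--catalog", dest="model", required=True)
    parser.add_argument("-w", "--workspace")
    parser.add_argument("-u", "--user")
    parser.add_argument("-p", "--password")
    parser.add_argument("-d", "--docs", default="docs")
    parser.add_argument("-t", "--translations", choices=["Yes", "No"], default="No")
    return parser.parse_args(argv)


# Example with an explicit argument list instead of sys.argv.
args = parse_args(["-s", "myserver", "-m", "AdventureWorks", "-w", "ws"])
conn_str = build_conn_str(args.server, args.model, args.workspace, args.user, args.password)
use_translations = args.translations == "Yes"
```

With required=True, a missing server or model makes argparse exit with a usage message, so the script never reaches the connection step with an incomplete connection string.
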

Daandamhuis commented 1 year ago

@Curts0 What would you need for an "Example docs" on the docs site? 😃

Curts0 commented 1 year ago

@Daandamhuis whatever you got haha I'll happily get it in.

I switched over to mkdocstrings instead of mkgendocs. It feels more intuitive and flexible. But I've also started exploring some of the pymdown extensions to make the markdown files look a lot better.

All you would need to do is add your markdown file to PyTabular\docs\ and update the nav part of the mkdocs.yml file.

site_name: PyTabular
site_description: "Connect to your Tabular models in Python!"
site_url: https://curts0.github.io/PyTabular/
docs_dir: docs
repo_name: Curts0/PyTabular
repo_url: https://github.com/Curts0/PyTabular
nav:
    - Home: README.md
    - Main Tabular Class: Tabular.md
    - Query Model: query.md
    - Refresh Model: refresh.md
    - PyObject Reference:
      - PyObjects: PyObjects.md
      - PyObject: PyObject.md
      - PyTables: PyTables.md
      - PyTable: PyTable.md
      - PyColumns: PyColumns.md
      - PyColumn: PyColumn.md
      - PyPartitions: PyPartitions.md
      - PyPartition: PyPartition.md
      - PyMeasures: PyMeasures.md
      - PyMeasure: PyMeasure.md
    - Misc. File Reference:
      - tabular_editor: tabular_editor.md
      - best_practice_analyzer: best_practice_analyzer.md
      - pbi_helper: pbi_helper.md
      - logic_utils: logic_utils.md
    - Running Traces: tabular_tracing.md
    - Documenting Model: document.md
    - Contributing: CONTRIBUTING.md

You can run mkdocs serve in your console to see it locally before submitting anything.

These three packages I think are the only ones needed to have it serve locally.

On a somewhat related note: the nice thing is that mkdocstrings will pick up when we are using our classes with type hinting. For example, your ModelDocumenter class naturally links to the Tabular class from your docstring.

Daandamhuis commented 1 year ago

@Curts0 Is it possible to have a .get('measure_name_xyz', 'Alternative Result') option, or is this already possible? The same way you would use it with a dict.

This would help a lot when using the "yaml" config.

Curts0 commented 1 year ago

@Daandamhuis Do you mean calling a pyobject from a pyobjects? Like model.Tables['Sales Fact'] or model.Measures['Total Sales']. If you need something right away, you could use the magic method that the dict-style lookup calls.

import pytabular as p
model = p.Tabular(CONNECTION_STR)

model.Measures['Total Sales']

# Is the same as:

model.Measures.__getitem__('Total Sales')

Right now, it will just error if nothing is found.

PyObject getitem

Daandamhuis commented 1 year ago

I've made a small .get("Search Term", "Alternative Result"). The downside currently is that it returns a totally different response than __getitem__ does. Is there a way I can create the alternate result so it does work?

Current Result

image

Addition

def get(self, object_str: str, alt_result: str = "") -> str:
    """Gets the object based on str.

    If the object isn't found, then an alternate result
    can be supplied as an argument.

    Args:
        object_str (str): str to lookup object
        alt_result (str): str to return when value isn't found.

    Returns:
        str: Result of the lookup, or the alternate result.
    """
    try:
        return self.__getitem__(object_str)
    except Exception:
        return alt_result
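
One possible way around the mismatched return types is to let the fallback be any object rather than only a str, as dict.get does. The sketch below uses hypothetical stand-in classes (NamedItem, NamedCollection), not PyTabular's own, and assumes that a failed lookup raises KeyError, which may not match what PyTabular's `__getitem__` actually raises:

```python
from typing import Any, Iterable, List


class NamedItem:
    """Hypothetical stand-in for a single object such as a measure."""

    def __init__(self, name: str) -> None:
        self.name = name


class NamedCollection:
    """Hypothetical stand-in for a PyObjects-style container."""

    def __init__(self, items: Iterable[NamedItem]) -> None:
        self._items: List[NamedItem] = list(items)

    def __getitem__(self, name: str) -> NamedItem:
        for item in self._items:
            if item.name == name:
                return item
        # Assumption: a failed lookup raises KeyError.
        raise KeyError(name)

    def get(self, name: str, default: Any = None) -> Any:
        """Return the matching item, or `default` when nothing matches."""
        try:
            return self[name]
        except KeyError:
            return default


measures = NamedCollection([NamedItem("Total Sales")])

found = measures.get("Total Sales")
missing = measures.get("measure_name_xyz")  # no default supplied, so None
fallback = measures.get("measure_name_xyz", NamedItem("Alternative Result"))
```

Passing a fallback of the same type as the stored items means the caller can treat both outcomes the same way, for example by reading `.name` on the result.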