sungchun12 / fst

fst: flow state tool | smooth where you want it, friction where you need it when data engineering
https://www.loom.com/share/ecfbdfb981e4443d94d2c95f16176118
Apache License 2.0
32 stars 1 forks source link

Blueprint general design #1

Open sungchun12 opened 1 year ago

sungchun12 commented 1 year ago

To better organize the given script into modular components, separate files where needed, and make it fully performant and maintainable, I would recommend the following structure:

  1. main.py: This file will include the main entry point, command-line functionality using Click, and import functions from other modules.
  2. logger.py: This file will include the logging setup and related functions.
    • setup_logger()
  3. query_handler.py: This file will contain the class and related functions for handling file events and executing queries.
    • class QueryHandler(FileSystemEventHandler)
    • handle_query(query, file_path)
  4. file_utils.py: This file will include all file-related utility functions.
    • get_active_file(file_path: str)
    • find_compiled_sql_file(file_path)
    • get_model_name_from_file(file_path: str)
    • generate_test_yaml(model_name, column_names, active_file_path)
    • get_model_paths()
  5. db_utils.py: This file will include all database-related utility functions.
    • execute_query(query: str, db_file: str)
    • get_duckdb_file_path()
    • get_project_name()
  6. directory_watcher.py: This file will include functions related to watching the directory for changes.
    • watch_directory(directory: str, callback, active_file_path: str)
# main.py
import click
import os
from logger import setup_logger
from query_handler import handle_query
from directory_watcher import watch_directory
from file_utils import get_active_file, get_model_paths

@click.group()
def main():
    pass

@main.command()
# ... (the remaining Click commands as you provided)

if __name__ == "__main__":
    main()

This modular approach will make the code more maintainable, as each module focuses on a specific functionality. Additionally, it allows for easier testing and optimization of individual components, leading to improved performance.

sungchun12 commented 1 year ago

Think through file, class, method architecture so it's not in all one messy script