elixir-cloud-aai / CrateGen

Compile GA4GH WES and TES requests from Workflow Run RO-Crates and package WES and TES runs as Workflow Run RO-Crates
Apache License 2.0

Discuss Project Design and Initial Version Proposal #13

Closed Karanjot786 closed 2 months ago

Karanjot786 commented 3 months ago

Summary

This issue discusses the overall design of the project and proposes an initial version. The goal is a well-organized, maintainable, and scalable codebase.

Objectives

  1. Define Clear Objectives and Scope

    • Clearly define what the project aims to achieve.
    • Identify the scope of the project to prevent feature creep.
  2. Use Object-Oriented Programming (OOP)

    • Encapsulation: Group related data and methods into classes.
    • Inheritance: Use inheritance to extend functionality.
    • Polymorphism: Design methods that can process objects differently based on their data type or class.
  3. Modular Architecture

    • Separation of Concerns: Break the project into distinct modules, each responsible for a specific functionality.
    • Reusability: Write modular code that can be reused across different parts of the project or in future projects.
  4. Design Patterns

    • Use common design patterns (e.g., Singleton, Factory, Strategy, Observer) where appropriate to solve recurring design problems in a standardized way.
  5. Define Models for TES and WES

    • Use Pydantic models to validate and manage TES and WES request payloads and responses.
    • Ensure proper error handling and data validation.
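
As an illustration of objective 5, here is a minimal sketch of what a Pydantic model for a TES task could look like. It covers only a small subset of the GA4GH TES task schema (`name`, `description`, `executors` with `image`, `command`, `workdir`); the class names `TesExecutor` and `TesTask` and the helper `validate_tes_payload` are hypothetical, not existing CrateGen code.

```python
from typing import Any, Dict, List, Optional, Tuple

from pydantic import BaseModel, ValidationError


class TesExecutor(BaseModel):
    """Subset of a TES executor: the container image and the command to run."""

    image: str
    command: List[str]
    workdir: Optional[str] = None


class TesTask(BaseModel):
    """Subset of a TES task; only `executors` is required here."""

    name: Optional[str] = None
    description: Optional[str] = None
    executors: List[TesExecutor]


def validate_tes_payload(
    payload: Dict[str, Any],
) -> Tuple[Optional[TesTask], List[str]]:
    """Hypothetical helper: return (model, []) on success or (None, problems)."""
    try:
        return TesTask(**payload), []
    except ValidationError as exc:
        # Turn each validation error into "field.path: message".
        problems = [
            f"{'.'.join(str(part) for part in err['loc'])}: {err['msg']}"
            for err in exc.errors()
        ]
        return None, problems
```

Returning a list of readable problems instead of letting the exception propagate is one possible way to handle errors; raising a project-specific exception would work equally well.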

Proposed Project Structure

CrateGen/
├── src/
│   ├── converters/
│   │   ├── tes_to_wrroc.py
│   │   └── wes_to_wrroc.py
│   ├── models/
│   │   ├── tes_models.py
│   │   └── wes_models.py
│   ├── utils/
│   │   ├── validation.py
│   │   └── formatting.py
│   └── cli.py
├── tests/
│   ├── test_converters.py
│   ├── test_models.py
│   └── test_utils.py
├── docs/
│   └── index.rst
├── .github/
│   └── workflows/
│       └── ci.yml
├── pyproject.toml
├── mypy.ini
├── README.md
└── LICENSE

Key Components

  1. Models: Define data models for TES and WES using Pydantic.
  2. Converters: Implement the conversion logic in separate modules.
  3. Utils: Utility functions for validation and formatting.
  4. CLI: Command-line interface for the tool.
  5. Tests: Unit and integration tests for all components.
  6. Docs: Documentation files.
  7. CI/CD: GitHub Actions workflow for CI.

Tasks

  1. Define Models

    • Create Pydantic models for TES and WES.
  2. Implement Converters

    • Write functions to convert TES to WRROC and WES to WRROC.
  3. Set Up CLI

    • Create a CLI for the tool using Click or another framework.
  4. Write Tests

    • Implement unit and integration tests for all components.
  5. Documentation

    • Write comprehensive documentation for the codebase, APIs, and overall project.
  6. CI/CD Pipeline

    • Set up a CI/CD pipeline to automate testing and deployment.
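
For task 3, the CLI could start out as small as the sketch below. It uses the standard-library `argparse` rather than Click, purely to keep the example dependency-free; the program name and the `--input`, `--output`, and `--conversion-type` options are assumptions for illustration, not an agreed interface.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build the argument parser (all option names are hypothetical)."""
    parser = argparse.ArgumentParser(
        prog="crategen",
        description="Convert TES or WES data to a Workflow Run RO-Crate.",
    )
    parser.add_argument("--input", required=True, help="Path to the input JSON file.")
    parser.add_argument("--output", required=True, help="Path for the output file.")
    parser.add_argument(
        "--conversion-type",
        required=True,
        choices=["tes-to-wrroc", "wes-to-wrroc"],
        help="Which conversion to run.",
    )
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """Parse arguments and report the requested conversion.

    A real implementation would load the input file and call the
    converter here; this sketch only echoes the parsed options.
    """
    args = build_parser().parse_args(argv)
    print(f"{args.conversion_type}: {args.input} -> {args.output}")
    return 0
```

Accepting `argv` as a parameter lets tests call `main([...])` directly without touching `sys.argv`.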

Next Steps

  1. Discuss and Refine Design

    • Review the proposed design and provide feedback.
    • Refine the design based on feedback.
  2. Break Down Work into Small Work Packages

    • Define small, manageable work packages for implementation.
    • Ensure each work package includes tests and documentation.
  3. Start Implementation

    • Begin with the abstraction layer and proceed with the TES to WRROC conversion, followed by WES to WRROC.

Please provide your feedback and suggestions on this proposed design.

Thank you!

uniqueg commented 3 months ago

This looks very reasonable.

A few points:

Karanjot786 commented 3 months ago

I have renamed the src folder to CrateGen. Now, I am working on refactoring the conversion functions according to the feedback. This includes creating abstract classes for the converters and separating the library entry point from the CLI.
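
A rough sketch of the refactoring described above, assuming an abstract base class with one concrete converter per API and a plain function as the library entry point. All names (`AbstractConverter`, `TESConverter`, `WESConverter`, `convert`) and the field mapping to a `CreateAction` entity are illustrative assumptions, not the actual implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Type


class AbstractConverter(ABC):
    """Interface that every converter has to implement."""

    @abstractmethod
    def convert_to_wrroc(self, data: Dict[str, Any]) -> Dict[str, Any]:
        """Convert an API payload into a WRROC-style entity."""


class TESConverter(AbstractConverter):
    def convert_to_wrroc(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Illustrative mapping only: a TES task becomes a CreateAction.
        return {
            "@id": data.get("id", ""),
            "@type": "CreateAction",
            "name": data.get("name", ""),
        }


class WESConverter(AbstractConverter):
    def convert_to_wrroc(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Illustrative mapping only: a WES run is identified by its run_id.
        return {
            "@id": data.get("run_id", ""),
            "@type": "CreateAction",
            "name": data.get("run_id", ""),
        }


# Registry mapping a short key to the converter class that handles it.
_CONVERTERS: Dict[str, Type[AbstractConverter]] = {
    "tes": TESConverter,
    "wes": WESConverter,
}


def convert(kind: str, data: Dict[str, Any]) -> Dict[str, Any]:
    """Library entry point; a CLI would only parse arguments and call this."""
    try:
        converter_cls = _CONVERTERS[kind]
    except KeyError:
        raise ValueError(f"Unsupported conversion kind: {kind!r}") from None
    return converter_cls().convert_to_wrroc(data)
```

Keeping `convert` free of any argument parsing means the package can be imported and used from other Python code without going through the command line.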

I'll update you once I have made significant progress on these changes.

uniqueg commented 3 months ago

Thanks a lot!

Please rename it to crategen (good name) and let's give the package the same name. By convention, capital letters should be used in package names only in exceptional cases.

We can still call the project and repo CrateGen (I have already renamed the repo). Also make sure to update the title in the README.md.

You can do the renaming in a single PR.

Rest sounds good :)

Karanjot786 commented 3 months ago

Hi @uniqueg,

I wanted to let you know that I’m going to open a PR shortly. It will include the initial project structure along with the files implemented so far. This is an initial version of the project and is not yet complete. Please review the structure and let me know if you have any feedback or suggestions for improvement.

Thank you!