skyl / corpora

Corpora is a self-building corpus that can help build other arbitrary corpora
GNU Affero General Public License v3.0
2 stars 0 forks source link

refactor(sync): some follow-up and cleanup to sync, namespace corpora.tasks #29

Closed skyl closed 2 weeks ago

skyl commented 2 weeks ago

PR Type

enhancement, tests


Description


Changes walkthrough πŸ“

Relevant files
Enhancement
corpus.py
Update import path for process_tarball function                   

py/packages/corpora/routers/corpus.py - Updated import path for `process_tarball` to `tasks.sync`.
+1/-1     
plan.py
Enhance issue creation guidelines in system message           

py/packages/corpora/routers/plan.py
  • Enhanced ISSUE_MAKER_SYSTEM_MESSAGE with additional guidelines for
    issue creation.
  • +4/-0     
    sync.py
    Enhance process_tarball and cleanup sync tasks                     

    py/packages/corpora/tasks/sync.py
  • Added corpus.save(update_fields=["updated_at"]) to process_tarball.
  • Removed commented-out code and unused tasks.
  • +4/-13   
    corpus.py
    Comment out debug print statements in sync command             

    py/packages/corpora_cli/commands/corpus.py
  • Commented out debug print statements for local and remote file hashes.

  • +5/-6     
    Tests
    test_corpus.py
    Update mock path for process_tarball in tests                       

    py/packages/corpora/routers/test_corpus.py - Updated mock path for `process_tarball` to `tasks.sync`.
    +2/-2     
    test_sync.py
    Add tests for Celery tasks in sync module                               

    py/packages/corpora/tasks/test_sync.py
  • Added new test cases for Celery tasks in sync.py.
  • Utilized mocking to simulate database interactions and task execution.

  • +107/-0 

    πŸ’‘ PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    github-actions[bot] commented 2 weeks ago

    PR Reviewer Guide πŸ”

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 πŸ”΅πŸ”΅πŸ”΅βšͺβšͺ
    πŸ§ͺ PR contains tests
    πŸ”’ No security concerns identified
    ⚑ Recommended focus areas for review

    Code Smell
    The `process_tarball` function directly updates the `updated_at` field of the `Corpus` model. Consider whether this logic should be encapsulated within the model itself to maintain separation of concerns. Code Smell
    The `ISSUE_MAKER_SYSTEM_MESSAGE` string is becoming quite lengthy and complex. Consider breaking it into smaller, more manageable parts or using a configuration file to maintain readability.
    github-actions[bot] commented 2 weeks ago

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Performance
    Verify the necessity of the database save operation to avoid unnecessary writes ___ **Ensure that the corpus.save method call is necessary and does not introduce
    unnecessary database writes, as it may impact performance.** [py/packages/corpora/tasks/sync.py [13]](https://github.com/skyl/corpora/pull/29/files#diff-7f230e9cbfe0834a9fcf3bd2f6623f008fc8cc6935fef3bc9e145d5e32f526fdR13-R13) ```diff +corpus.save(update_fields=["updated_at"]) - ```
    Suggestion importance[1-10]: 7 Why: Ensuring that the database save operation is necessary can prevent unnecessary writes, which is important for performance optimization. This suggestion is relevant and could have a significant impact on the application's efficiency.
    7
    Best practice
    Add a debug flag to conditionally print debug information for easier troubleshooting ___ **Consider adding a debug flag to conditionally print debug information, which can
    help in troubleshooting without cluttering the output in production.** [py/packages/corpora_cli/commands/corpus.py [79-81]](https://github.com/skyl/corpora/pull/29/files#diff-c04094e69a69881444f8b97d9fee7e0d29683f5e062cb4d2cf4530bc08daedf7R79-R81) ```diff -# TODO: debug flag! -# c.console.print("Local file hash map:") -# c.console.print(pformat(local_files_hash_map, width=80)) +if debug: + c.console.print("Local file hash map:") + c.console.print(pformat(local_files_hash_map, width=80)) ```
    Suggestion importance[1-10]: 6 Why: Adding a debug flag to control the printing of debug information is a useful enhancement for troubleshooting. It improves code maintainability and usability without affecting production output.
    6
    Enhancement
    Use a configuration file or environment variable to manage system message verbosity ___ **Consider using a configuration file or environment variable to manage the verbosity
    level of system messages, allowing for easier adjustments without modifying the
    code.** [py/packages/corpora/routers/plan.py [11-20]](https://github.com/skyl/corpora/pull/29/files#diff-d17ee5521ab86acd2823afbc993ced9ea8afb2f074db451ff29c56b83f4e33aaR11-R20) ```diff +ISSUE_MAKER_SYSTEM_MESSAGE = ( + "You are a highly skilled assistant specializing in creating well-structured, actionable issues " + "for this project. Your goal is to process the provided context and user input to generate " + "a concise and clear issue. This issue should include an informative title and a detailed body, " + "providing contributors with the necessary insights to effectively address the task. " + "Highlight relevant sections of the corpus and ensure that the issue is easy to understand and work on. " + "Do not be overly prescriptive, give viable choices for how it may be done but do not dictate the solution. " + "Talk about the goals and the problem to solve. " + "Be extremely terse and to the point. This is not a marketing pitch, it is a technical issue or task. " + "Don't use extra fluffy language." +) - ```
    Suggestion importance[1-10]: 5 Why: The suggestion to use a configuration file or environment variable for managing verbosity levels is a good practice for flexibility and maintainability. However, it is not directly addressing a critical issue in the current code, hence a moderate score.
    5
    skyl commented 2 weeks ago
    Screenshot 2024-11-10 at 8 01 57β€―PM