neomatrix369 / learning-path-index

A repo with data files, assets and code supporting and powering the Learning Path Index Project
MIT License

Google Cloud Skills Boost README update #82

Open asvcode opened 3 weeks ago

asvcode commented 3 weeks ago

Updated some portions of the README.

Summary by Sourcery

Enhance the Google Cloud Skills Boost scraper by improving error handling and refactoring the URL handling in the extract_ml_learning_path function. Update the README files to provide clearer instructions, including Windows-specific guidance. Modify the Activity model to support both integer and float durations.

sourcery-ai[bot] commented 3 weeks ago

Reviewer's Guide by Sourcery

This pull request updates the Google Cloud Skills Boost README and makes improvements to the scraping functionality in the scrape_journey.py file. It also includes updates to the LLM POC variant README and lpiGPT.py file, addressing Windows compatibility and providing clearer instructions. Additionally, there's a minor type hint update in the models.py file.

File-Level Changes

Change: Improved error handling and flexibility in the extract_ml_learning_path function
Details:
  • Added error handling for missing journey details and links
  • Implemented graceful fallbacks for missing data
  • Added user input for GCSB_JOURNEY_URL
  • Improved CSV writing process with error handling
Files:
  • app/course-scraper/src/scrapers/google_cloud_skill_boost/scrape_journey.py

Change: Updated README files with improved instructions and Windows compatibility
Details:
  • Added Windows-specific activation command for virtual environment
  • Updated directory navigation instructions
  • Corrected module paths for running scraper scripts
  • Added Windows-specific requirement for Microsoft Visual C++
  • Included instructions for setting OLLAMA_HOST environment variable
Files:
  • app/course-scraper/src/scrapers/google_cloud_skill_boost/README.md
  • app/llm-poc-variant-01/README.md

Change: Minor updates to lpiGPT.py
Details:
  • Removed trailing whitespace
  • Updated formatting for consistency
Files:
  • app/llm-poc-variant-01/lpiGPT.py

Change: Updated type hint in models.py for better compatibility
Details:
  • Changed int | float to Union[int, float] for the duration field
Files:
  • app/course-scraper/src/scrapers/google_cloud_skill_boost/models.py
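
To illustrate the type hint change, here is a minimal sketch. The `int | float` annotation syntax (PEP 604) is only evaluated successfully at runtime on Python 3.10 and later, while `Union[int, float]` also works on older interpreters. The real `Activity` model is not shown in this thread, so the dataclass base, the `title` field, and the `total_duration` helper below are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Activity:
    # Hypothetical field; the real model's other fields are not shown in the PR.
    title: str
    # Union[int, float] is accepted on Python versions older than 3.10,
    # where the equivalent `int | float` annotation would fail when evaluated.
    duration: Union[int, float]


def total_duration(activities: List[Activity]) -> Union[int, float]:
    """Sum durations, whether they are whole or fractional values."""
    return sum(activity.duration for activity in activities)
```

For example, `total_duration([Activity("Intro", 2), Activity("Lab", 1.5)])` adds an integer and a float duration without any conversion step.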

Sequence Diagram

sequenceDiagram
    participant User
    participant Function as extract_ml_learning_path
    participant Web as Web Request
    participant Parser as HTML Parser
    participant DataList as Data List

    User->>Function: Call with GCSB_JOURNEY_URL
    Function->>Web: Send GET request
    Web-->>Function: Return HTML content
    Function->>Parser: Parse HTML content
    Parser-->>Function: Return parsed DOM
    loop For each journey
        Function->>Function: Extract journey details
        Function->>Function: Handle missing data
        Function->>DataList: Append journey data
    end
    Function-->>User: Return data list
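
The parse-and-loop part of the diagram can be sketched roughly as follows. This is not the code from scrape_journey.py: the markup (`<li class="journey">` items containing an `<a>` link), the `JourneyParser` class, the `extract_journeys` function, and the `"N/A"` fallback value are all assumptions chosen to keep the example self-contained. The GET request step is omitted, so the function takes the HTML as a string.

```python
from html.parser import HTMLParser


class JourneyParser(HTMLParser):
    """Collects one record per <li class="journey"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._current = None      # record being built, or None outside a journey
        self._in_link = False     # True while inside the journey's <a> element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "journey":
            self._current = {"title": None, "link": None}
        elif tag == "a" and self._current is not None:
            self._current["link"] = attrs.get("href")
            self._in_link = True

    def handle_data(self, data):
        # Only text inside the link counts as the journey title.
        if self._current is not None and self._in_link and data.strip():
            self._current["title"] = data.strip()

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "li" and self._current is not None:
            self.records.append(self._current)
            self._current = None


def extract_journeys(html, default="N/A"):
    """Parse the HTML and replace any missing title or link with a fallback."""
    parser = JourneyParser()
    parser.feed(html)
    parser.close()
    return [
        {"title": rec["title"] or default, "link": rec["link"] or default}
        for rec in parser.records
    ]
```

A journey item without a link still produces a row, with both fields set to the fallback value, instead of raising an exception partway through the loop.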

asvcode commented 1 week ago

@neomatrix369 @TobeTek OK, I may need a walkthrough with this. I created one branch and made separate commits on it for the various changes, but I see now that this is confusing, because those commits show up in all the PRs I have submitted. Can one of you please let me know how best to resolve this? Each PR should contain only one change, but because they all come from the same branch, the changes show up as duplicates. Thanks

TobeTek commented 1 week ago

Hi @asvcode, I went ahead and restructured these PRs into #107 and #106. Here's how I did that:

  1. Check out the pull request branch. Since I don't have a local clone of your fork, I used this command to retrieve the changes from the PR:

    git fetch origin pull/82/head:gcsb_readme_update

  2. Run an interactive rebase to pick the desired commits. Using the rebase keywords and dialog, we can pick the commits we'd like and rewrite the commit history, viz.:

    git rebase -i d543b98~1

    (Brief guide on rebasing)

I repeated this for each new PR I created.
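
As an illustrative sketch of repeating these two steps for several PRs, the commands can be generated from the PR number and the oldest commit to keep. The helper names `fetch_pr_command` and `interactive_rebase_command` are hypothetical; only the `pull/<number>/head` ref on GitHub remotes and the `git rebase -i <commit>~1` form come from the steps above.

```python
from typing import List


def fetch_pr_command(pr_number: int, local_branch: str, remote: str = "origin") -> List[str]:
    """Build the command that fetches a PR's head into a new local branch."""
    # GitHub exposes each pull request's head under the ref pull/<number>/head.
    return ["git", "fetch", remote, f"pull/{pr_number}/head:{local_branch}"]


def interactive_rebase_command(oldest_commit: str) -> List[str]:
    """Build the interactive rebase command starting just before oldest_commit."""
    # The ~1 suffix selects the parent, so oldest_commit itself appears in the
    # rebase todo list and can be picked or dropped.
    return ["git", "rebase", "-i", f"{oldest_commit}~1"]


if __name__ == "__main__":
    for command in (
        fetch_pr_command(82, "gcsb_readme_update"),
        interactive_rebase_command("d543b98"),
    ):
        print(" ".join(command))
```

The argument lists can be passed to `subprocess.run` as they are, although the interactive rebase still needs a terminal and an editor to choose the commits.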

asvcode commented 1 week ago

@TobeTek Perfect! Thank you for that!