openzim / python-scraperlib

Collection of Python code to re-use across Python-based scrapers
GNU General Public License v3.0
19 stars 16 forks source link

set a user-agent for handle_user_provided_file #141

Closed aryanA101a closed 7 months ago

aryanA101a commented 7 months ago

Fix #103

"User-Agent" : "zimscraperlib/3.4.0-dev0 (contact+github@openzim.org)"

codecov[bot] commented 7 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 100.00%. Comparing base (538f8af) to head (f114d38).

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #141 +/- ## ========================================= Coverage 100.00% 100.00% ========================================= Files 32 32 Lines 1393 1397 +4 Branches 240 240 ========================================= + Hits 1393 1397 +4 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

rgaudin commented 7 months ago

I think a versioned zimscraperlib UA is fine as long as it's easy to change in scrapers. Contact should be dev@openzim.org

aryanA101a commented 7 months ago

@benoit74 Sure, I'll keep it in mind.

aryanA101a commented 7 months ago

@benoit74 There is tight coupling in our context, which makes it impossible to test without significant refactoring.

aryanA101a commented 7 months ago

@benoit74 @rgaudin To make it such that UA can be changed.

def handle_user_provided_file(
    source: Optional[Union[pathlib.Path, str]] = None,
    dest: Optional[pathlib.Path] = None,
    in_dir: Optional[pathlib.Path] = None,
    nocopy: bool = False,  # noqa: FBT001, FBT002
    user_agent: Optional[str]=f"{PROJECT_NAME.replace(' ','/')} ({CONTACT})"
)
benoit74 commented 7 months ago

There is tight coupling in our context, which makes it impossible to test without significant refactoring.

In one (or more) test of handle_user_provided_file, you can use pytest monkeypatch on stream_file and assert that proper header (User-Agent) is passed. No refactoring needed.

To make it such that UA can be changed, it requires a breaking change.

I don't get where you see a breaking change, could you develop your concern?

aryanA101a commented 7 months ago

In one (or more) test of handle_user_provided_file, you can use pytest monkeypatch on stream_file and assert that proper header (User-Agent) is passed. No refactoring needed.

Okay, I was unaware of monkey patching.

I don't get where you see a breaking change, could you develop your concern?

Nahh, it's alright. There was a bitflip in my brain.

benoit74 commented 7 months ago

Side note, @aryanA101a you need to update your first comment to write Fix #xxx (instead of #xxx) so that corresponding issue is automatically closed once the PR is merged (https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue)

benoit74 commented 7 months ago

@rgaudin final review is welcomed