JupiterBroadcasting / show-scraper

Scraper written in python to convert episodes hosted on Fireside or jupiterbroadcasting.com into Hugo Markdown files
5 stars 5 forks source link

Feat/config map mod #28

Open elreydetoda opened 2 years ago

elreydetoda commented 2 years ago

accidentally closed #24 , so re-opening this with the changes added back... (reference in https://github.com/JupiterBroadcasting/show-scraper/issues/20)

gerbrent commented 2 years ago

Just looking for reviewers on this before merging..

ChanceM commented 2 years ago

@elreydetoda I didnt see you had an open PR for this. My function came out a little different.

def get_username_from_url(url):
    """
    Get the last path part of the url which is the username for the hosts and guests.
    Replace it using the `username_map` from config.
    """
    username = urlparse(url).path.split("/")[-1]
    usernames_map = config.get("usernames_map")

    # Replace username if found in usernames_map or default to input username
    return next((key for key, list in usernames_map.items() if username in list), username)

Also of course I have tests for this I can contribute as well after this is merged if you want.

image

elreydetoda commented 2 years ago

My only hesitancy with your solution (which is definitely more technically elegant IMO) is people have to understand generator comprehension. While my solution is more simplistic, because people only have to understand a for loop.

I also thought I'd made mine return as soon as it found a match (probably did on #24 but forgot to on this one 🤦‍♂️), so it cuts off the loop immediately. (I'll make that modification now) That way we're not spending the extra cycles (however infinitesimal they are) looking through the rest of the dict_items (which yours does the same as mine does currently, where it continues to loop after it's found a match).


I do think when we re-write this we'll have to introduce more technically elegant solutions to keep complexity down overall, but for now keeping things more explict and simpler concepts makes it easier for others to contribute.

elreydetoda commented 2 years ago

@ChanceM, what's your opinion on my comment above? Do you think it makes sense or am I just being too trivial?

ChanceM commented 2 years ago

@ChanceM, what's your opinion on my comment above? Do you think it makes sense or am I just being too trivial?

Well since this is a community driven project encouraging participation is definitely a worthy goal. As a counterpoint with additional comments as you have suggested in other places it could also be a way of teaching new concepts to the community.

Also your right I should have thrown a filter in there to stop the execution.

elreydetoda commented 2 years ago

As a counterpoint with additional comments as you have suggested in other places it could also be a way of teaching new concepts to the community.

Ya...that's a fair point...how would you document that if we went with your solution?

Also your right I should have thrown a filter in there to stop the execution.

If you can do this with relative easy (especially since a generator is already a lazy comprehension IIRC) and the docs are clear enough, then how about we go with your solution.

We could have this PR merged and then you can make a PR to refactor it with the generator comprehension instead of the for loop. That way people can see a simpler way in the history if they want to.

ChanceM commented 2 years ago

This gives a pretty good explanation List Comprehensions in Python and Generator Expressions.

next(filter(str.__instancecheck__,(key for key, list in usernames_map.items() if username in list)), username)

Oh I do like the idea of stacking the changes for history. But either way yours or mine works, I'll still provide the tests if you think they might be valuable.

elreydetoda commented 2 years ago

Ya, that us a pretty good resource! Also, definitely i think either way the tests would absolutely be of value 😁 it would also be a good example for people if/when we swap out our code (since it should do the same thing) that the tests should pass either way without modification. 😁

Also...interesting I've never used a filter before, I'll have to read up on that 😅

elreydetoda commented 2 years ago

Also, @ChanceM if you feel comfortable enough and have time, feel free to review my PR(s) for this repo. I don't know if @gerbrent is just looking for any reviews or specifically @kbondarev , but either way it would hurt to have a second set of eyes for things.

I know you've already at least glanced at this one, but at least the other one I have open. I'm planning on getting to your guest one that's open for this repo, just postulating if that the best way forward or if there's another way.

elreydetoda commented 2 years ago

Just need an approval on this before merging it. Then it'll let @ChanceM be able to improve it with his implementation.