otosky / medium_stats

Command Line and Python tool for Scraping Your Medium Stats
GNU General Public License v3.0
20 stars, 5 forks

Canonical name of publications for credentials matching, etc. #2

Closed eagereyes closed 4 years ago

eagereyes commented 4 years ago

When scraping a publication, it seems that the credentials file is matched against the URL the user specifies, which may or may not include the medium.com/ part. There should be a canonical name for a publication (IMHO just the part of the URL after https://medium.com/) that is used everywhere. That way, even if I specified the full URL, it would still match the creds based on just the canonical name.
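To illustrate the idea, here is a minimal sketch of such a normalization. The function name `canonical_publication_name` is hypothetical and not part of medium_stats; it simply assumes the canonical name is the first path segment after medium.com/.

```python
from urllib.parse import urlparse


def canonical_publication_name(value: str) -> str:
    """Reduce a publication URL or bare slug to the part after medium.com/.

    Hypothetical helper for illustration; not taken from medium_stats.
    """
    value = value.strip()
    # Add a scheme so urlparse treats "medium.com/slug" as host + path.
    if "://" not in value and value.startswith(("medium.com/", "www.medium.com/")):
        value = "https://" + value
    parsed = urlparse(value)
    # With a host present, use the URL path; otherwise treat the input as a slug.
    path = parsed.path if parsed.netloc else value
    # The canonical name is assumed to be the first non-empty path segment.
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no publication name found in {value!r}")
    return segments[0]
```

With this, `https://medium.com/my-publication`, `medium.com/my-publication/`, and `my-publication` would all resolve to the same key for credential lookup.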

otosky commented 4 years ago

You're totally right - I think I may just tie the credentials check back to the username with which the publications are associated, since the username is the source of the uid/sid anyway.

That would make the CLI command for scraping a publication:

 medium-stats scrape_publication -u [USERNAME] -s [SLUG] --all

# where SLUG is the canonical name after medium.com/
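For readers who want to see how a command of this shape could be parsed, here is a rough sketch using Python's standard `argparse`. Only the subcommand name and the `-u`, `-s`, and `--all` flags come from the command above; the `dest` names and the overall structure are assumptions, not the actual medium_stats implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the command shape shown above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="medium-stats")
    subcommands = parser.add_subparsers(dest="command", required=True)

    pub = subcommands.add_parser("scrape_publication")
    # The username selects which stored credentials to use.
    pub.add_argument("-u", dest="username", required=True)
    # The slug is the canonical publication name after medium.com/.
    pub.add_argument("-s", dest="slug", required=True)
    pub.add_argument("--all", action="store_true")
    return parser


# Parse an example invocation instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(
    ["scrape_publication", "-u", "alice", "-s", "my-publication", "--all"]
)
```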

I've made the changes under branch: https://github.com/otosky/medium_stats/tree/connect_scrape_pub_to_user

Let me know if that's a suitable adaptation. I think that would also accommodate issue #3 and issue #1, as well.

otosky commented 4 years ago

In this manner, the config file can also hold other usernames and associated cookies. That might be a really fringe case, though.
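As a sketch of what that could look like, the example below reads a config with one section per username using the standard `configparser` module. The INI layout, the `sid`/`uid` key names, and the `load_cookies` helper are assumptions made for illustration; the real config format may differ.

```python
import configparser

# Hypothetical config contents: one section per username, placeholder values.
SAMPLE_CONFIG = """
[alice]
sid = SID_FOR_ALICE
uid = UID_FOR_ALICE

[bob]
sid = SID_FOR_BOB
uid = UID_FOR_BOB
"""


def load_cookies(config_text: str, username: str) -> dict:
    """Return the sid/uid cookies stored under the given username's section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(username):
        raise KeyError(f"no credentials stored for user {username!r}")
    section = parser[username]
    return {"sid": section["sid"], "uid": section["uid"]}


cookies = load_cookies(SAMPLE_CONFIG, "bob")
```

Because the lookup is keyed on the username rather than the publication URL, any publication scraped with `-u bob` would use the same cookies regardless of how its slug was typed.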

eagereyes commented 4 years ago

Yes, I think that's a great solution!

otosky commented 4 years ago

Changes applied in 154d55f73fe139c1e102570fadeb5184a77aea7e

Closing #3 and #1, as well!