Jessime / youtube_history

A quick analysis of all YouTube videos in a user's history.
MIT License

History only going back 1 year. #7

Open Lycelsara opened 6 years ago

Lycelsara commented 6 years ago

I'm using Windows, and everything appeared to run correctly bar the wordcloud, but once I reviewed my data, I saw that it only went back to March of 2017: 5062 videos ago. I was particularly interested in seeing an analysis of my entire history going back 7 years. Any clue of what went wrong, or how I could fix it?

Jessime commented 6 years ago

This is a new one.

A few questions:

  1. Are you sure you didn't get an error?
  2. If you look in the youtube_history/data/raw folder, do you only have a year of videos there? (i.e., is it the download that failed, or something else?)
  3. Are you sure you didn't change accounts or something last year?

Honestly, none of these are great questions, but I'm not immediately sure what's going on, so I'm covering the basics.
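
For question 2, a quick way to count what actually landed in the raw folder is something like the sketch here. The path is the one mentioned above, relative to wherever the repo was cloned; `count_files` is a made-up helper name, not something in this package, and it assumes only that the folder contains ordinary files.

```python
from pathlib import Path


def count_files(folder):
    """Return the number of regular files directly inside `folder`."""
    return sum(1 for p in Path(folder).iterdir() if p.is_file())


if __name__ == "__main__":
    # Path taken from the question above; adjust it to your checkout.
    raw = Path("youtube_history/data/raw")
    if raw.is_dir():
        print(f"{count_files(raw)} files in {raw}")
    else:
        print(f"{raw} not found")
```

If the count matches the number the script reported, the download itself stopped short rather than a later analysis step.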

Lycelsara commented 6 years ago

  1. It said it was successful, as far as I saw. From the very beginning of it downloading, it only said there were 5062 videos (which is only the past year). Is there an error log I could check?
  2. Yep, only the 5062 videos.
  3. When I look at my history at https://www.youtube.com/feed/history it goes back 7 years, so I don't believe so.

I tried the same thing on another account (my sister's) and hers only went back three years (about 14,000 videos this time), when, again, she has a 7-year history on YouTube.

Jessime commented 6 years ago

Ugh, this sounds like a bug in the youtube-dl library. It's weird that yours goes back a year and your sister's goes back three.

I'll try to poke around and figure out what's going on, but it'll probably be a while.

Lycelsara commented 6 years ago

Alright thanks, let me know if you find anything!

MacroPower commented 5 years ago

I'm having this issue too, except mine randomly works. About 33% of the time I get my full 10k+ history, the rest of the time I only get ~3000.

Jessime commented 5 years ago

I wonder if you're triggering two-factor authentication (which does seem to happen at random). youtube-dl has done a lot of work on this recently. When was the last time you installed either this package or youtube-dl? If it's been a while, try:

pip install --upgrade -r requirements.txt

Side question: is there a particular reason you're downloading your history multiple times? Once you have it, those files should be stored permanently.

MacroPower commented 5 years ago

My session was disconnecting (unrelated). Since I don't have screen set up, that seemingly kills the process. I just deleted everything and started over, and that's when I started noticing the issue.

I just pulled the deps a few days ago but I can try upgrading anyway...

Jessime commented 5 years ago

Ah, got it.

What I'm going to try is adding a request rate limiter option. I think if you wait a second or two between requests, you're probably a lot less likely to trigger whatever monitoring systems Google's running. I'll get that out in the next day or two and we can see what that does.
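
Roughly what I have in mind, as a sketch only: `rate_limited` and its parameters are hypothetical names, not existing code in this package or in youtube-dl, and the real requests are made by youtube-dl rather than by a plain loop like this.

```python
import time


def rate_limited(items, delay, sleep=time.sleep):
    """Yield each item, pausing `delay` seconds between consecutive items.

    `sleep` is injectable so the pause can be replaced in tests.
    """
    for index, item in enumerate(items):
        # No pause before the first item, only between requests.
        if index > 0 and delay > 0:
            sleep(delay)
        yield item


if __name__ == "__main__":
    # Placeholder IDs; each loop body stands in for one request.
    for video_id in rate_limited(["id_1", "id_2", "id_3"], delay=1.5):
        print("would request", video_id)
```

The pause goes between requests rather than after every one, so a run of N requests waits N - 1 times.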

MacroPower commented 5 years ago

That makes a lot of sense. I was wondering how you were avoiding the captchas considering I do tend to run into them with similar applications.

Jessime commented 5 years ago

I don't know if it's a complete remedy, but I've added a --delay flag, which takes a float saying how many seconds to delay between requests. Maybe a value of ~2 would help?
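
For illustration, a float-valued flag like this can be parsed with argparse as sketched here. This is not the package's actual code: `build_parser` is a hypothetical name, and the default of 0 (no delay) is an assumption.

```python
import argparse


def build_parser():
    """Build a stand-in parser with a float-valued --delay option."""
    parser = argparse.ArgumentParser(description="Hypothetical stand-in CLI")
    parser.add_argument(
        "--delay",
        type=float,
        default=0.0,  # assumed default: no waiting between requests
        help="seconds to wait between requests",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"waiting {args.delay} seconds between requests")
```

Because the type is float, both `--delay 2` and `--delay 1.5` are accepted.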