Closed daehwankim112 closed 3 years ago
Hey @daehwankim112, thanks for the thoughtful write up and for giving the project a whirl. Sorry you're having so much trouble.
With regards to the "Unable to look up account info: HTTP Error 400: Bad Request" error, using youtube-dl to log in to YouTube has always been hit or miss. You can see lots of other people running into similar problems this year:
https://github.com/ytdl-org/youtube-dl/issues/23860
Interestingly, I've never downloaded data with takeout. I just tried it now and I'll see what happens when I run it with my data.
One thing I noticed while downloading is that the file type I downloaded was a .zip file. Presumably you unzipped your download first (I assume you did, but it's always worth double checking)?
It looks like the UnicodeDecodeError can be fixed pretty easily, according to Stack Overflow. I'll give it a shot when my data downloads.
More importantly though, I assume that the data format from takeout is going to be different than the one my program expects. So even if we fix the current issue, it might require an update to get the json read in correctly.
I'll report back ASAP!
Hello Jessime. Sorry for the late response, and thank you so much for taking the time to look into this.
I unzipped my file before running it.
"More importantly though, I assume that the data format from takeout is going to be different than the one my program expects. So even if we fix the current issue, it might require an update to get the json read in correctly."
I agree.
I will wait. If there is anything you want me to do, let me know. Thank you.
Hey! One thing led to another and I ended up making a bunch of changes to this program over the last couple of days. The most important one is that you can (and should) now specify the --takeout parameter:
python youtube_history.py --takeout /path/to/Takeout
Note that the data you've downloaded in Takeout is just a list of videos you've watched, with none of the information about the videos. So youtube_history.py will still take a while to run the first time while it downloads the metadata for each video. This shouldn't be an issue; just make sure you stay connected to the internet.
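As a rough illustration of why the metadata still has to be downloaded, here is a minimal sketch of pulling video IDs out of a watch-history list. The "titleUrl" key and the shape of the sample entries are assumptions about the Takeout export, not something confirmed in this thread, and the function names are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def video_id_from_url(url):
    """Return the value of the 'v' query parameter, or None if absent."""
    query = parse_qs(urlparse(url).query)
    ids = query.get("v")
    return ids[0] if ids else None


def video_ids(entries, url_key="titleUrl"):
    """Collect video IDs from history entries.

    Assumption: each entry is a dict that may carry a watch URL under url_key.
    Entries without a URL (for example removed videos) are skipped.
    """
    ids = []
    for entry in entries:
        url = entry.get(url_key)
        vid = video_id_from_url(url) if url else None
        if vid:
            ids.append(vid)
    return ids


# Hypothetical sample entries, only for demonstration.
sample = [
    {"title": "Watched Example", "titleUrl": "https://www.youtube.com/watch?v=abc123XYZ_-"},
    {"title": "Watched a video that has been removed"},
]
ids = video_ids(sample)
print(ids)
```

Each ID collected this way is only an identifier, so titles, tags, and ratings still need a per-video lookup.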
Also, sometime in the last 4 years, Google stopped saving the likes/dislikes for each video, and just stores an "average rating". So, I redid a bunch of stats to account for that. I think the end results are even better than they were before!
One small caveat is that I haven't done a ton of testing with this new code yet, just tried it on a couple of people.
Give it a whirl and let me know how it goes!
Hello Jessime. Thank you for the update. I am so glad that you are working on it!
I pip installed it, downloaded my takeout, and ran it. It says:
Welcome!
usage: youtube_history.py [-h] [-o OUT] [-d DELAY]
youtube_history.py: error: unrecognized arguments: --takeout ../../Takeout
Then I found out there is a -h parameter and it says
Welcome!
usage: youtube_history.py [-h] [-o OUT] [-d DELAY]

optional arguments:
  -h, --help            show this help message and exit
  -o OUT, --out OUT     Path to empty directory for data storage.
  -d DELAY, --delay DELAY
                        Time to wait between requests. May help avoid 2FA.
It does not recognize the --takeout parameter. I looked at your commit and it seems the code that registers the parameter was not added; there is no change in youtube_history.py. Maybe you forgot to commit and push?
Thank you
The commit was pushed to GitHub 11 hours ago now.
So, all you need to do is git pull the update on the master branch of your local checkout.
But, I'm pretty curious about what you mean when you say "I pip installed it". There's no pip package (that I've made at least). So just wondering what you mean by this?
I saw
Copy or clone this package from Github.
Open the Terminal/Command Line and navigate to where you copied the package:
$ cd path/to/copied/directory
Then, just run:
$ pip install -r requirements.txt
to install the dependencies.
in README.md. There are two things added in requirements.txt, so I thought I might need to install it again. I got a bunch of messages saying the requirements were already satisfied, so I think it works?
I cloned it and ran it again. I am seeing the same result.
C:\Users\daehwan\Desktop\youtube history\youtube_history>python youtube_history.py --takeout /data
Welcome!
usage: youtube_history.py [-h] [-o OUT] [-d DELAY]
youtube_history.py: error: unrecognized arguments: --takeout /data
This is what I think might be the cause, though I am probably wrong.
if __name__ == '__main__':
    print('Welcome!'); stdout.flush()
    parser = argparse.ArgumentParser()
    parser.add_argument("-o", '--out', default='data',
                        help="Path to empty directory for data storage.")
    parser.add_argument('-d', '--delay', default=0,
                        help='Time to wait between requests. May help avoid 2FA.')
    args = parser.parse_args()
    analysis = Analysis(args.out, float(args.delay))
    analysis.run()
    launch_web()
These are lines 304-314 in youtube_history.py at the moment. There is no parser.add_argument call for --takeout.
Thank you
🤦 that's because I pushed everything except the main file. Pretty dumb mistake on my part. Try pulling again!
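For anyone following along, registering such a flag with argparse might look roughly like the sketch below. The -o and -d options mirror the snippet quoted above; the help text for --takeout and the build_parser name are assumptions, and the real main file may differ.

```python
import argparse


def build_parser():
    """Build a parser with the existing -o/-d options plus --takeout."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-o", "--out", default="data",
                        help="Path to empty directory for data storage.")
    parser.add_argument("-d", "--delay", default=0.0, type=float,
                        help="Time to wait between requests. May help avoid 2FA.")
    # Assumption: the actual script may describe or default this differently.
    parser.add_argument("--takeout", default=None,
                        help="Path to an unzipped Google Takeout directory.")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--takeout", "/path/to/Takeout"])
print(args.out, args.delay, args.takeout)
```

Without the third add_argument call, argparse rejects the flag with the "unrecognized arguments" error shown earlier in this thread.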
Great! I ran it and it worked.
There were some errors that I had to fix myself: the codec did not recognize the characters in my language, and the program needed to ignore videos that have since become available only to people who joined the channel.
For anyone who gets a cp949 error (I don't quite recommend this method, since you are changing the standard library directly; consider changing it back once you get through it):
You've watched Korean videos and the codec does not recognize Korean, because the default codec is cp949 instead of utf-8.
Go to C:\Users\your_name\Anaconda3\Lib, open pathlib.py, and change the read_text part (line 1195) to this:
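    def read_text(self, encoding=None, errors=None):
        """
        Open the file in text mode, read it, and close the file.
        """
        # Force UTF-8 regardless of the platform default; pass the caller's
        # errors argument through (the string 'errors' is not a valid handler).
        with self.open(mode='r', encoding='UTF-8', errors=errors) as f:
            return f.read()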
Done
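A less invasive alternative, if you would rather not edit pathlib.py, is to pass the encoding at the call site. This is only a sketch: read_utf8 is a hypothetical helper, and where the program actually calls read_text is not shown in this thread.

```python
import tempfile
from pathlib import Path


def read_utf8(path):
    """Read a text file as UTF-8 regardless of the platform's default codec."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising.
    return Path(path).read_text(encoding="utf-8", errors="replace")


# Small self-check with a temporary file containing Korean text.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.json"
    sample.write_text('{"title": "한국어 영상"}', encoding="utf-8")
    text = read_utf8(sample)

print(text)
```

This keeps the standard library untouched, so nothing needs to be reverted afterwards.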
For those who receive this error:
Creating dataframe...
Traceback (most recent call last):
  File "youtube_history.py", line 369, in <module>
    analysis.run()
  File "youtube_history.py", line 353, in run
    self.start_analysis()
  File "youtube_history.py", line 336, in start_analysis
    self.check_df()
  File "youtube_history.py", line 247, in check_df
    self.df_from_files()
  File "youtube_history.py", line 210, in df_from_files
    data = [json.load(open(files.format(i))) for i in range(1, num + 1)]
  File "youtube_history.py", line 210, in <listcomp>
    data = [json.load(open(files.format(i))) for i in range(1, num + 1)]
FileNotFoundError: [Errno 2] No such file or directory: 'data\\raw\\17724.info.json'
A video you have watched is now only available to people who have joined the channel.
Edit youtube_history.py as follows.
Line 166:
line = p.stdout.readline().decode("utf-8", 'ignore').strip()
Line 192:
line = p.stdout.readline().decode("utf-8", 'ignore').strip()
Done
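To show what the 'ignore' argument changes, here is a small standalone sketch. The byte string is made up for illustration, and decode_line is a hypothetical name rather than a function from youtube_history.py.

```python
def decode_line(raw):
    """Decode a line of subprocess output, dropping bytes that are not valid UTF-8."""
    return raw.decode("utf-8", "ignore").strip()


# 0xE9 is not valid UTF-8 when followed by a space, so it is silently dropped.
raw = b"caf\xe9 ok\n"
line = decode_line(raw)
print(line)

# Without 'ignore', the same bytes raise UnicodeDecodeError.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

Note that 'ignore' discards the offending bytes, so some characters in the output may be lost.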
For those who receive this error:
Welcome!
Creating dataframe...
Traceback (most recent call last):
  File "youtube_history.py", line 369, in <module>
    analysis.run()
  File "youtube_history.py", line 353, in run
    self.start_analysis()
  File "youtube_history.py", line 336, in start_analysis
    self.check_df()
  File "youtube_history.py", line 247, in check_df
    self.df_from_files()
  File "youtube_history.py", line 210, in df_from_files
    data = [json.load(open(files.format(i))) for i in range(1, num + 1)]
  File "youtube_history.py", line 210, in <listcomp>
    data = [json.load(open(files.format(i))) for i in range(1, num + 1)]
FileNotFoundError: [Errno 2] No such file or directory: 'data\\raw\\17724.info.json'
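I don't know what happened. Maybe it crashed before it finished downloading the metadata.
Edit youtube_history.py as follows.
Line 207:
num = (number of the missing json file) - 1
For the traceback above, the missing file is 17724.info.json, so that would be num = 17723.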
Done
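Instead of hard-coding num, the loading step could skip files that are missing. The sketch below assumes the files are named 1.info.json, 2.info.json, and so on, as the traceback suggests; load_info_files is a hypothetical helper, not the function in youtube_history.py.

```python
import json
import tempfile
from pathlib import Path


def load_info_files(raw_dir, num):
    """Load 1.info.json .. num.info.json from raw_dir, skipping missing files."""
    data, missing = [], []
    for i in range(1, num + 1):
        path = Path(raw_dir) / f"{i}.info.json"
        try:
            with path.open(encoding="utf-8") as f:
                data.append(json.load(f))
        except FileNotFoundError:
            # Remember which index was absent instead of aborting the run.
            missing.append(i)
    return data, missing


# Demo: files 1 and 3 exist, file 2 is missing.
with tempfile.TemporaryDirectory() as tmp:
    for i in (1, 3):
        (Path(tmp) / f"{i}.info.json").write_text(json.dumps({"id": i}), encoding="utf-8")
    data, missing = load_info_files(tmp, 3)

print(len(data), missing)
```

Returning the list of missing indices makes it easy to report which videos could not be fetched.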
For those who see rectangular boxes in the word cloud:
You are missing a font for whatever language you are using.
Download or locate the font you want to use, then edit line 232 of youtube_history.py to be:
wordcloud = WordCloud(font_path='path/to/the/font',
                      width=1920,
                      height=1080,
                      relative_scaling=.5)
Then the world is beautiful and everything works fine.
Thank you so much for this. It helped me understand who I am. I surely watched a lot of memes... I also watched 18,906 videos, and it shocked me how much time I have spent on YouTube. If I had spent my time more wisely I might be at Harvard by now. This is an awesome project. Keep up the good work.
Hello. Thanks for the interesting project.
I am on Windows 10.
I am receiving
"Unable to look up account info: HTTP Error 400: Bad Request"
after I type in my credentials. I thought that since this is a 2-year-old project, the way Google handles things might have changed, and it may think I am a bot.
$ python youtube_history.py -d 1
didn't work either. Therefore, I decided to get the metadata of my YouTube history myself. I went to https://takeout.google.com/ and got a json of my YouTube history.
Then I get this error
I read through other issues and found that on a second run, it looks for a csv file. I think it might assume this is not the first run because the metadata is already there. So I used
$ python youtube_history.py -o /path/to/empty/data/directory/
in order to specify the location of the metadata, and it still looks for the csv. Here are the things that could be a cause of the error.
I am very interested in this project. Thank you