Jayapraveen / INE-courses-downloader

Python script to download courses from the INE.com website for personal and educational use
GNU General Public License v3.0
39 stars, 19 forks

Video download - Token expired. Trying to refresh error #51

Open MatterSec opened 1 year ago

MatterSec commented 1 year ago

Error when downloading any course:

Please enter the number corresponding to the course you would like to download 235

    Downloading: Windows Red Team Lab: 0%| | 0.00/1.00 [00:00<?, ?course/s]
    No access to video metadata;
    0%| | 0.00/1.00 [00:00<?, ?videofile/s]
    Token expired. Trying to refresh ..
    0%| | 0.00/1.00 [00:02<?, ?videofile/s]
    Downloading: Windows Red Team Lab: 0%| | 0.00/1.00 [00:02<?, ?course/s]
    Traceback (most recent call last):
      File "/Users/user/INE-courses-downloader/Ine.py", line 693, in <module>
        downloader(course)
      File "/Users/user/INE-courses-downloader/Ine.py", line 574, in downloader
        out = get_meta(k["uuid"])
              ^^^^^^^^^^^^^^^^^^^
      File "/Users/user/INE-courses-downloader/Ine.py", line 269, in get_meta
        access_token_refetch()
      File "/Users/user/INE-courses-downloader/Ine.py", line 129, in access_token_refetch
        refresh_token = out["data"]["tokens"]["data"]["Refresh"]
    KeyError: 'Refresh'

Does anyone know how to fix this? I've made the other changes (thanks Ghost) to fix the refresh etc., but I'm now hitting this error.

grizzly94 commented 1 year ago

Same problem over here

ahaggard2013 commented 1 year ago

This seems to occur when the script can't retrieve the video's metadata to form a proper name for download. This happened on all the videos on the courses I tried to pull, so it's likely the structure changed at some point. If you just want the course slides/labs you can skip video downloads that have issues by changing the following:

Line 257 at the moment in the get_meta function

    elif(out.status_code == 403):
        #print("No access to video metadata;\nToken expired. Trying to refresh ..")
        #access_token_refetch()
        #print("Resuming operations..")
        return None  # skip this video instead of refreshing and retrying
        #return get_meta(uuid)

And move the try statement up at the end of the downloader function.

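    out = get_meta(k["uuid"])
    try:
        # out is None when get_meta skipped the video; out[0] then raises
        # and the except branch moves on to the next item
        out[0] = str(video_index) + '.' + out[0]
        pbar.set_description("Downloading: %s" %out[0])
        video_index = video_index + 1
        download_video(out[1],out[0])
        if(out[2]):
            download_subtitle(out[0],out[2])
    except Exception:
        pass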
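
The workaround above amounts to treating a `None` result from `get_meta` as "skip this video". A minimal sketch of that control flow follows, with the network calls replaced by stand-ins so it runs on its own. `download_videos`, `fake_get_meta`, and the `download` callback are hypothetical names for illustration and are not part of Ine.py.

```python
def download_videos(items, get_meta, download):
    """Download each item, skipping those whose metadata is unavailable.

    `get_meta` and `download` are injected stand-ins for the script's own
    functions; their names and signatures here are assumptions.
    """
    video_index = 1
    skipped = []
    for item in items:
        meta = get_meta(item["uuid"])
        if meta is None:  # e.g. get_meta returned None on a 403
            skipped.append(item["uuid"])
            continue
        name = f"{video_index}.{meta[0]}"
        download(meta[1], name)
        video_index += 1
    return skipped


# Hypothetical stand-in so the sketch runs without network access.
def fake_get_meta(uuid):
    if uuid == "bad":
        return None
    return [f"{uuid}.mp4", f"https://example.invalid/{uuid}"]


downloaded = []
skipped = download_videos(
    [{"uuid": "a"}, {"uuid": "bad"}, {"uuid": "b"}],
    fake_get_meta,
    lambda url, name: downloaded.append(name),
)
print(downloaded)  # ['1.a.mp4', '2.b.mp4']
print(skipped)     # ['bad']
```

An explicit `None` check keeps the skip visible and leaves other errors (disk full, bad URL) free to surface instead of being swallowed.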

I would make a PR, but this script seems to have a lot of small issues identified and requires more changes to work properly. If I have time I'll try and make it fully functional, but unfortunately don't atm

outtycast commented 1 year ago

You need to fix two things. First, find

    refresh_token = out["data"]["tokens"]["data"]["Refresh"]

and change all instances to

    refresh_token = out["data"]["tokens"]["data"]["Bearer"]

Second, you have to fix the headers. They've changed ;D
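
The `KeyError: 'Refresh'` comes from indexing a key that the response no longer contains. One way to make that lookup less brittle is to try a list of candidate key names and report what is actually present when none match. This is only a sketch: the nesting mirrors `out["data"]["tokens"]["data"][...]` from the traceback, while `extract_token`, the candidate list, and the sample payload are assumptions.

```python
def extract_token(payload, candidates=("Refresh", "Bearer")):
    """Return the first token found under payload["data"]["tokens"]["data"]."""
    tokens = payload.get("data", {}).get("tokens", {}).get("data", {})
    for key in candidates:
        if key in tokens:
            return tokens[key]
    # Listing the available keys makes a future API change easier to diagnose.
    raise KeyError(f"none of {candidates} found; available keys: {sorted(tokens)}")


# Made-up sample payload in the shape implied by the traceback.
sample = {"data": {"tokens": {"data": {"Bearer": "abc123"}}}}
token = extract_token(sample)
print(token)  # abc123
```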

outtycast commented 1 year ago

replace this old meta function with this updated one

    def get_meta(uuid):
        host = "video.rmotr.com"
        header = {"Host": host,"Origin": referer,"Referer": referer,"Authorization": access_token,"User-Agent": user_agent,"Accept": accept,"X-Requested-With": x_requested_with,"Accept-Encoding": accept_encodings,"sec-fetch-mode": sec_fetch_mode,"sec-fetch-dest": sec_fetch_dest,"Content-Type": content_type}
        out = requests.get(video_url.format(uuid),headers = header)
        if out.status_code == 200:
            out = json.loads(out.text)
            name = sanitize(out["title"])
            subtitle,video,maxquality,nextquality = 0,0,0,0
            for i in out["playlist"][0]["sources"]:
                try:
                    if(i["height"] > maxquality):
                        nextvideo = video
                        video = i["file"]
                except:
                    continue
            if (quality == 1):
                video = video
            else:
                video = nextvideo
            for i in out["playlist"][0]["tracks"]:
                if(i["kind"] == "captions"):
                    subtitle = i["file"]
            out = []
            out.append(name)
            out.append(video)
            if (subtitle):
                out.append(subtitle)
            return out
        elif(out.status_code == 403):
            print("No access to video metadata;\nToken expired. Trying to refresh ..")
            access_token_refetch()
            print("Resuming operations..")
            return get_meta(uuid)


new function

    def get_meta(uuid):
        querystring = {
            "parent_type": "course",
            "parent_id": parent_id
        }

        host = "video.rmotr.com"
        header = {"Host": host,"Connection": "keep-alive","Origin": referer,"Authorization": access_token,"User-Agent": user_agent,"Accept": accept,"Accept-Encoding": accept_encodings,"sec-fetch-site": "cross-site","sec-fetch-mode": sec_fetch_mode,"sec-fetch-dest": sec_fetch_dest}
        out = requests.get(video_url.format(uuid),headers=header, params=querystring)
        if out.status_code == 200:
            out = json.loads(out.text)
            name = sanitize(out["title"])
            subtitle,video,maxquality,nextquality = 0,0,0,0
            for i in out["playlist"][0]["sources"]:
                try:
                    if(i["height"] > maxquality):
                        nextvideo = video
                        video = i["file"]
                except:
                    continue
            if (quality == 1):
                video = video
            else:
                video = nextvideo
            for i in out["playlist"][0]["tracks"]:
                if(i["kind"] == "captions"):
                    subtitle = i["file"]
            out = []
            out.append(name)
            out.append(video)
            if (subtitle):
                out.append(subtitle)
            return out
        elif(out.status_code == 403):
            print("No access to video metadata;\nToken expired. Trying to refresh ..")
            access_token_refetch()
            print("Resuming operations..")
            return get_meta(uuid)
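
One thing to note about the source loop in `get_meta`: `maxquality` is never updated and `nextvideo` is only assigned inside the loop, so the chosen file depends on the order of the `sources` list. A sketch of an order-independent selection is below. `pick_source` is a hypothetical helper; it assumes, as the function above does, that a source dict may carry `height` and `file` keys, and that `quality == 1` means "best available".

```python
def pick_source(sources, quality=1):
    """Pick a source file by resolution rank (1 = highest, 2 = second highest)."""
    ranked = sorted(
        (s for s in sources if "height" in s and "file" in s),
        key=lambda s: s["height"],
        reverse=True,
    )
    if not ranked:
        return None
    # Clamp the rank so asking for second-best still works with one source.
    index = min(max(quality, 1), len(ranked)) - 1
    return ranked[index]["file"]


sources = [
    {"file": "v360.mp4", "height": 360},
    {"file": "audio.m4a"},  # no height: ignored
    {"file": "v1080.mp4", "height": 1080},
    {"file": "v720.mp4", "height": 720},
]
best = pick_source(sources, quality=1)
second = pick_source(sources, quality=2)
print(best, second)  # v1080.mp4 v720.mp4
```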

outtycast commented 1 year ago

Add the parent ID to the downloader function.

Change this:

    def downloader(course):
        course_name = course["name"]
        if os.name == 'nt':

Above the course_name line (line 506 of the code), add the global parent_id assignment:

    def downloader(course):
        global parent_id
        parent_id = course["id"]

        course_name = course["name"]
        if os.name == 'nt':

outtycast commented 1 year ago

    Downloading: Databases, Caching, & Big Data in AWS: 0% 0.00/5.00 [00:00<?, ?course/s]
    0% 0.00/1.00 [00:00<?, ?videofile/s]
    Downloading: 1.vod-5370-databases-caching-and-big-data-in-aws-001.mp4: 0% 0.00/1.00 [00:00<?, ?videofile/s]
    Downloading: 1.vod-5370-databases-caching-and-big-data-in-aws-001.mp4: 100% 1.00/1.00 [00:01<00:00, 1.36s/videofile]
    Downloading: Databases, Caching, & Big Data in AWS: 40% 2.00/5.00 [00:01<00:04, 1.36s/course]
    0% 0.00/1.00 [00:00<?, ?videofile/s]
    Downloading: 1.vod-5370-databases-caching-and-big-data-in-aws-002.mp4: 0% 0.00/1.00 [00:00<?, ?videofile/s]
    Downloading: 1.vod-5370-databases-caching-and-big-data-in-aws-002.mp4: 100% 1.00/1.00 [00:01<00:00, 1.38s/videofile]

    0% 0.00/11.0 [00:00<?, ?videofile/s]
    Downloading: 1.vod-5370-databases-caching-and-big-data-in-aws-003.mp4: 0% 0.00/11.0 [00:00<?, ?videofile/s]
    Downloading: 1.vod-5370-databases-caching-and-big-data-in-aws-003.mp4: 9% 1.00/11.0 [00:01<00:13, 1.40s/videofile]
    Downloading: 2.vod-5370-databases-caching-and-big-data-in-aws-004.mp4: 18% 2.00/11.0 [00:02<00:12, 1.40s/videofile]
    Downloading: 2.vod-5370-databases-caching-and-big-data-in-aws-004.mp4: 27% 3.00/11.0 [00:03<00:07, 1.03videofile/s]
    Downloading: 3.vod-5370-databases-caching-and-big-data-in-aws-005.mp4: 45% 5.00/11.0 [00:03<00:05, 1.03videofile/s]
    Downloading: 3.vod-5370-databases-caching-and-big-data-in-aws-005.mp4: 55% 6.00/11.0 [00:04<00:03, 1.37videofile/s]
    Downloading: 4.vod-5370-databases-caching-and-big-data-in-aws-006.mp4: 73% 8.00/11.0 [00:05<00:02, 1.37videofile/s]
    Downloading: 4.vod-5370-databases-caching-and-big-data-in-aws-006.mp4: 64% 7.00/11.0 [00:06<00:03, 1.11videofile/s]
    Downloading: Databases, Caching, & Big Data in AWS: 20% 1.00/5.00 [00:09<00:36, 9.04s/course]
    Traceback (most recent call last):
      File "/content/test.py", line 720, in <module>
        downloader(course)
      File "/content/test.py", line 598, in downloader
        download_lab(k["uuid"], lab_index)
      File "/content/test.py", line 321, in download_lab
        subfolder_name = 'Lab'+str(lab_index)+'.'+data["name"]
    KeyError: 'name'

This means the key doesn't exist in the JSON.

Either find another source for the folder name or use try and except.

On line 321, change this:

    subfolder_name = 'Lab'+str(lab_index)+'.'+data["name"]

to this (it then uses a default name when none exists in the JSON):

    try:
        subfolder_name = 'Lab'+str(lab_index)+'.'+data["name"]
    except KeyError:
        subfolder_name = 'Lab'+str(lab_index)+'.Lab'
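
An alternative to the try/except is `dict.get` with a fallback, which handles the missing key without catching exceptions. This is a small sketch; `lab_folder_name` is a hypothetical helper and the sample dicts are made up.

```python
def lab_folder_name(data, lab_index):
    """Build the lab subfolder name, falling back to 'Lab' when no name exists."""
    name = data.get("name") or "Lab"  # also covers an empty or None name
    return f"Lab{lab_index}.{name}"


print(lab_folder_name({"name": "Pivoting"}, 1))  # Lab1.Pivoting
print(lab_folder_name({}, 2))                    # Lab2.Lab
```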

outtycast commented 1 year ago

Most likely the file name or folder name has commas in it.

You need a string replace to handle characters like !@#$%^&, or you'll get folder and filename errors.

    # fix string filename
    def fix_string_filename(string):
        forbidden = ["?", "|", ":", "<", ">", "\"", "*", "/"]
        for char in forbidden:
            if char in string:
                string = string.replace(char, "-")
        return string

    # fix string foldername
    def fix_string_foldername(string):
        forbidden = ["?", "|", ":", "<", ">", "\"", "*", "/", "\n", "\t", "\r", "\x00"]
        for char in forbidden:
            if char in string:
                string = string.replace(char, "")
        return string

Usage:

    sub_folder = fix_string_foldername(sub_folder)

    filename = fix_string_filename(filename)
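
The same kind of replacement can be done in a single pass with `str.translate`. The sketch below is one possible variant, not the script's own code: `sanitize_name` is a hypothetical name, the backslash is added to the reserved set as an assumption (the lists above do not include it), and trailing dots and spaces are trimmed because Windows does not accept them at the end of a name.

```python
# Characters reserved in Windows file names, each mapped to "-".
_FORBIDDEN = {ord(c): "-" for c in '?|:<>"*/\\'}
# Control characters (including \n, \t, \r, \x00) are dropped entirely.
_FORBIDDEN.update({code: None for code in range(32)})


def sanitize_name(name):
    """Replace reserved characters, drop control characters, trim trailing dots/spaces."""
    cleaned = name.translate(_FORBIDDEN).rstrip(". ")
    return cleaned or "untitled"


print(sanitize_name('Intro: What is "AWS"?\n'))  # Intro- What is -AWS--
```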

outtycast commented 1 year ago

I think your problem is the path, judging by the message.

At the top, after the URL variables are set (near line 99), add this:

    # application path
    mypath = os.getcwd()

Then, where you're making the subfolder but before the file is opened for writing, use mypath like this so it knows the full directory path:

    sub_folder = os.path.join(mypath, sub_folder)
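
A short sketch of the same idea, joining the subfolder onto a base directory and creating it before any file is opened. `ensure_subfolder` is a hypothetical helper, and a temporary directory stands in for `os.getcwd()` so the example leaves nothing behind.

```python
import os
import tempfile


def ensure_subfolder(base_dir, sub_folder):
    """Join sub_folder onto base_dir, create it if needed, return the full path."""
    full_path = os.path.join(base_dir, sub_folder)
    os.makedirs(full_path, exist_ok=True)  # no error if it already exists
    return full_path


# A temporary directory stands in for the application path (os.getcwd()).
with tempfile.TemporaryDirectory() as base:
    path = ensure_subfolder(base, os.path.join("Course", "Lab1.Lab"))
    created = os.path.isdir(path)
    is_absolute = os.path.isabs(path)
    # Opening a file for writing works because the directory now exists.
    with open(os.path.join(path, "notes.txt"), "w") as fh:
        fh.write("ok")

print(created, is_absolute)
```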

dmaniip commented 1 year ago

@outtycast can you share the Ine.py please? I'm having trouble downloading the videos.

Crazyhead90 commented 1 year ago

Updated the file to work again with the current INE website: https://github.com/Jayapraveen/INE-courses-downloader/pull/53

@dmaniip @yzbpzneith

Aval0n1 commented 1 year ago

@Crazyhead90 I managed to download using your commit changes. Thank you!

Aval0n1 commented 1 year ago

@Crazyhead90 Unfortunately, subtitles are not being downloaded. With the original script it worked, but the script downloaded only subtitles in Spanish.
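
In the `get_meta` versions posted earlier in this thread, the loop over `tracks` overwrites `subtitle` for every captions track, so the last one in the list wins; that may be why only one language ends up on disk. A sketch of a language-aware pick follows. It assumes each track carries a `label` field naming the language, which this thread does not confirm (only `kind` and `file` appear in the code), and `pick_subtitle` is a hypothetical helper.

```python
def pick_subtitle(tracks, preferred="english"):
    """Return the caption file for the preferred language, else the first caption."""
    captions = [t for t in tracks if t.get("kind") == "captions" and "file" in t]
    for track in captions:
        # "label" is an assumed field; tracks without it simply never match.
        if preferred in str(track.get("label", "")).lower():
            return track["file"]
    return captions[0]["file"] if captions else None


tracks = [
    {"kind": "thumbnails", "file": "thumbs.vtt"},
    {"kind": "captions", "file": "en.vtt", "label": "English"},
    {"kind": "captions", "file": "es.vtt", "label": "Spanish"},
]
print(pick_subtitle(tracks))             # en.vtt
print(pick_subtitle(tracks, "spanish"))  # es.vtt
```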