Closed oijm17 closed 2 years ago
Are you using the --download-assets and --download-captions options?
Of course, yes. However, I just found that the problem occurs in a specific situation, so I have modified the description.
Would you mind posting some of the log?
Hi, I am experiencing the same issue. For me, it's generating some zero-byte HTML files instead of downloading the lecture resources.
Is there anything in the console output like an error?
Not as far as I could tell. I'll paste here a snippet from the log in a bit.
UPDATE 1: If I download with the --skip-lectures --download-assets flags, nothing is downloaded and there's nothing of particular interest in the console:
UPDATE 2: Using just the --download-assets flag downloads zero-byte HTML files instead of the actual assets. There's nothing of particular interest in the console in this case, either.
@Puyodead1, please let me know what information I could extract that might be useful to you in debugging this issue.
Thank you, this is very helpful. Would you mind sending me your bearer token and the course URL so I can do some testing? I don't know any courses I have with HTML file resources which is why I never was able to actually test it. You can email me puyodead@protonmail.com or DM me on Discord Puyodead1#001
Yeah, sure thing, but I think your Discord handle is missing a digit from the identifier. Just to note that the resource is PDF, not HTML, but for whatever reason it's not being picked up as such.
ah damn github tried formatting it as a number and removed a digit lmao Puyodead#0001
and so it's a PDF that is generating empty HTML files?
Essentially, this is the page:
... but the PDF is not downloaded, and instead there's this empty HTML file.
@Puyodead1 I've emailed you, in the meantime. 😊
I've identified the potential issue, could you please try the latest commit (8756bfc2668f688cb438db1570b7be7e57ab8cf7)?
Also, I noticed this specific course is rather large. If it makes it easier for testing, you can use --save-to-file on the first run and then --load-from-file on any further runs as additional arguments to reduce wait times from processing the course data.
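For anyone scripting repeated test runs, the save-then-load pattern described above could be sketched roughly like this. It assumes that --save-to-file and --load-from-file are plain switches that take no value, and build_command is a hypothetical helper, not part of the downloader itself.

```python
import shlex


def build_command(course_url, first_run):
    """Build the argument list for one downloader run (hypothetical helper)."""
    cmd = ["python", "main.py", "--course-url", course_url, "--download-assets"]
    # Assumption: both flags are value-less switches. The first run caches the
    # processed course data, later runs reuse it instead of re-fetching it.
    cmd.append("--save-to-file" if first_run else "--load-from-file")
    return cmd


if __name__ == "__main__":
    url = "<Course URL>"  # placeholder, replace with the real course URL
    for first_run in (True, False):
        cmd = build_command(url, first_run)
        # Print a shell-quoted command line rather than running it.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to subprocess.run unchanged once the placeholder URL is replaced.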
Nice! I will most likely have a look sometime tomorrow. Also, thanks for the pointer on saving to and loading from file.
Hi @Puyodead1, the PDFs are downloading fine now, but 0-byte HTML files are still being generated or downloaded too. It's not a massive problem, since I can just get rid of them all at the end, but it's indicative of an issue that may have additional implications.
UPDATE: So far, it seems to be just that one course, so it's possible that it may be structured in a different way from most other ones.
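As a stopgap for the cleanup mentioned above, a small script along these lines could list and then delete the leftover zero-byte HTML files. This is only a sketch: the out_dir folder name is an assumption, and the function names are made up for illustration.

```python
from pathlib import Path


def find_empty_html(root):
    """Return all zero-byte .html files below root, sorted for stable output."""
    return sorted(
        p for p in Path(root).rglob("*.html")
        if p.is_file() and p.stat().st_size == 0
    )


def remove_empty_html(root, dry_run=True):
    """Return the zero-byte .html files; delete them unless dry_run is True."""
    empty = find_empty_html(root)
    for path in empty:
        if not dry_run:
            path.unlink()
    return empty


if __name__ == "__main__":
    # "out_dir" is an assumed folder name; point this at your download directory.
    for path in remove_empty_html("out_dir", dry_run=True):
        print(path)
```

Running with dry_run=True first only lists the candidates, so nothing is deleted until the list has been checked.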
That's odd, is it the same course you provided in your email? If so, could you tell me the exact command you're using (ofc censor any sensitive stuff). During my testing, I only downloaded assets and it didn't produce any html files
Yes, it's that course. Apparently, if you only download assets then it doesn't reproduce, but if you use the full python main.py --course-url <Course URL> --download-assets command, then I believe you should see the issue.
Huh, okay. Could you send me a new bearer token for testing? If you prefer Discord, my correct tag is Puyodead1#0001
@Puyodead1 I'm not sure if I'm doing something wrong, but the handle doesn't seem to work. I'll email you another token.
OH, haha. Puy❄dead1#0001
give that a try. 🤦🏻
@Xen0byte Resolved in bc9f6ecb1a40aa0aa5eaed66e9935a7d582f9a76
@oijm17 If you continue to have this issue, please open a new issue.
--download-assets ignores *.py assets, from this one
It doesn't ignore anything, if it's an attachment on a lecture it will be downloaded. If you have errors, make a new issue.
Description
When you have already downloaded only the video files of the classes and then run the script again, this time specifying the arguments to download all resources and all subtitles, the script does not download any of the attached resources and downloads only a very small number of the subtitles.
To Reproduce
Download only the video files of the course:
python main.py -c https://www.udemy.com/courses/myawesomecourse -b <Bearer Token>
Then run the script again, adding the --download-assets --download-captions -l all arguments in order to complete the previous download with the attachments and subtitles, like this:
python main.py -c https://www.udemy.com/courses/myawesomecourse -b <Bearer Token> --download-assets --download-captions -l all
Desktop
OS: Windows 10, Linux CentOS 8.5 Stream
Python: v3.9.1, v3.6.8