JohnDavid07 commented 4 years ago

err

JohnDavid07 commented 4 years ago

``exclusions = ['__MACOSX/']

destination = "/content/drive/My Drive/"
download_tasks = [
    {
        'folder': 'gdppci',
        # (a private URL)
        'url': 'https://........................workers.dev/0:/..................................'
    },
]

print('##################################')
print('# Crawling all downloadable urls #')
print('##################################', end='\n\n')
tasks = []
for task in download_tasks:
    tasks += crawler_v2(task['url'], [], os.path.join(destination, task['folder']), 0, exclusions, verbose=False)

print(json.dumps(tasks, indent=2), end='\n\n')

total_size = get_filesize(sum([int(task['size']) for task in tasks]))

print(json.dumps(tasks, indent=2))
print('\nTotal Task:', len(tasks))
print('Total size: %.3fGB' % total_size, end='\n\n')``

JohnDavid07 commented 4 years ago

Can you please post a guide on how to use this?

NullBruce commented 4 years ago

@atlonxp can you please look into this? I can't find the problem with "tasks".

atlonxp commented 4 years ago

@JohnDavid07 @NullBruce could you provide the GoIndex link? I will try it when I have time.

NullBruce commented 4 years ago

@atlonxp literally any link.

##################################
# Crawling all downloadable urls #
##################################

https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #2 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #3 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #4 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #5 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/

Data is missing! change a plan - use terminal CURL
Nah, something went wrong!

JSONDecodeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
     55 response = os.popen("curl --globoff {} -d ''".format(url.geturl())).read()
---> 56 response_json = json.loads(response)
     57 except Exception as e:

4 frames

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
     57 except Exception as e:
     58 print('Nah, something went wrong!')
---> 59 print(e.args())
     60 return []
     61 except Exception as e:

TypeError: 'tuple' object is not callable
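A note on the second exception in the traceback above: `e.args` is a tuple attribute on Python exceptions, so calling it as `e.args()` raises `TypeError: 'tuple' object is not callable` and hides the original `JSONDecodeError`. The sketch below shows the same pattern with the parentheses dropped. `parse_listing` is a hypothetical helper written for illustration, not the actual `crawler_v2` code, and it only covers the JSON-parsing step.

```python
import json


def parse_listing(response_text):
    """Parse a JSON listing; return [] when the body is not valid JSON.

    Hypothetical stand-in for the json.loads step shown in the traceback.
    """
    try:
        return json.loads(response_text)
    except json.JSONDecodeError as e:
        print('Nah, something went wrong!')
        # e.args is a tuple attribute, not a method: no parentheses.
        print(e.args)
        return []


# An empty body (what a dead link tends to return) no longer crashes.
print(parse_listing(''))
```

This does not make a broken link work; it only lets the handler report the real decoding error and return an empty list.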

atlonxp commented 4 years ago

Huh! You don't seem to be aware that tutflix (aka tutnetflix) has been banned from Cloudflare. The links you provided stopped working a long time ago.

An easy way to check whether a link is working is to visit the GoIndex website:

If it displays its contents --> it is working.
If it does not display anything and only shows the loading progress bar --> it is not working at all.
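The manual check above can also be approximated from a script. The traceback earlier in this thread shows the crawler sending an empty POST (`curl --globoff {} -d ''`) and parsing the reply as JSON, so a rough test is whether such a request returns a JSON body at all. The sketch below assumes that behavior; `is_json_listing` and `check_index` are hypothetical names, and the `fetch` parameter exists only so the check can be tried without network access.

```python
import json
import urllib.request


def is_json_listing(body):
    """Return True when body parses as a JSON object or array."""
    try:
        parsed = json.loads(body)
    except (json.JSONDecodeError, TypeError):
        return False
    return isinstance(parsed, (dict, list))


def check_index(url, fetch=None):
    """Return True when an empty POST to url yields a JSON body.

    fetch is an optional callable (url -> str), an assumption added
    here for offline testing; by default an empty POST is sent.
    """
    if fetch is None:
        def fetch(u):
            req = urllib.request.Request(u, data=b'', method='POST')
            with urllib.request.urlopen(req, timeout=10) as resp:
                return resp.read().decode('utf-8', errors='replace')
    try:
        body = fetch(url)
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
    return is_json_listing(body)
```

A `False` result only means the link did not answer with JSON; it cannot tell a dead link from an index type the downloader does not support.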

NullBruce commented 4 years ago

@atlonxp here's a link that doesn't work. I also tried multiple others that are up.

##################################
# Crawling all downloadable urls #
##################################

https://manga.td-index.workers.dev/0:/
retry #2 https://manga.td-index.workers.dev/0:/
retry #3 https://manga.td-index.workers.dev/0:/
retry #4 https://manga.td-index.workers.dev/0:/
retry #5 https://manga.td-index.workers.dev/0:/

Data is missing! change a plan - use terminal CURL
Nah, something went wrong!

JSONDecodeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
     55 response = os.popen("curl --globoff {} -d ''".format(url.geturl())).read()
---> 56 response_json = json.loads(response)
     57 except Exception as e:

4 frames

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
     57 except Exception as e:
     58 print('Nah, something went wrong!')
---> 59 print(e.args())
     60 return []
     61 except Exception as e:

TypeError: 'tuple' object is not callable

Rudo2204 commented 4 years ago

@NullBruce Acrous index is not yet supported. See #7. I tried poking around a bit, but nothing seems to work :(

atlonxp commented 4 years ago

@Rudo2204 @NullBruce I need to look into how Acrous works. I think it is just a theme, but there may well be a script for dynamic content generation (which is causing the problem).

atlonxp / recursive-goIndex-downloader

Errors while running cell5 In V2 #10
