6fe9d454 / bdsmlr-scripts

Scripts for downloading blogs, likes, etc from BDSMLR in lieu of a functioning API.

list index out of range #2

Closed by varadins 2 years ago

varadins commented 2 years ago

I am using the following command and receive the following error. Am I providing variables incorrectly?

➜ bdsmlr-scripts-master python3 bdsmlr_get_blog_fast.py -u legitemail@gmail.com -p http://unparalleled.bdsmlr.com

[http://unparalleled] No tags set, grabbing all posts
[http://unparalleled] Logged in ...
Traceback (most recent call last):
  File "bdsmlr_get_blog_fast.py", line 201, in <module>
    main(args)
  File "bdsmlr_get_blog_fast.py", line 98, in main
    true_end_page = list(map(int, filter(str.isnumeric, page_numbers)))[-1]
IndexError: list index out of range

varadins commented 2 years ago

@6fe9d454 I know you aren't actively developing this anymore, but could you tell me if this is an indicator that my command line arguments are bad?

6fe9d454 commented 2 years ago

I think I see the problem. The >> symbol from the page indices at the bottom is mistakenly included in the page-item list. I'll add something to remove all non-numerics. It's odd, though, because there is already a filter for str.isnumeric on the list, so it shouldn't be making its way in there anyway.
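
A quick sketch of that point, using made-up pagination labels rather than anything scraped from the site: the existing str.isnumeric filter already drops a trailing arrow symbol, so the only way the [-1] lookup can raise IndexError is if the list has no numeric entries at all.

```python
# Hypothetical labels, standing in for what the page-item XPath might return.
labels = ["1", "2", "3", "»"]

# Same expression as in the script, minus the trailing [-1].
numeric = list(map(int, filter(str.isnumeric, labels)))
print(numeric)  # the arrow symbol is filtered out

# With no labels at all, the filtered list is empty and [-1] fails.
try:
    list(map(int, filter(str.isnumeric, [])))[-1]
except IndexError as exc:
    print("empty pagination list:", exc)
```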

6fe9d454 commented 2 years ago

I just tried the blog you linked and it works fine. Are you sure you're supplying everything correctly? For example:

$ python bdsmlr_get_blog_fast.py -u username -p password https://unparalleled.bdsmlr.com

You could also try wrapping the username and password in quotes. That could be why it's unable to find the proper end page.

varadins commented 2 years ago

Thank you for looking at this, and I'm glad to hear I'm the problem =) I was running it with python3 and was getting further.

Using the command variations below:

python bdsmlr_get_blog_fast.py -u 'email@gmail.com' -p 'password' https://unparalleled.bdsmlr.com

or

python bdsmlr_get_blog_fast.py -u "email@gmail.com" -p "password" https://unparalleled.bdsmlr.com

or

python bdsmlr_get_blog_fast.py -u email@gmail.com -p password https://unparalleled.bdsmlr.com

I receive:

  File "bdsmlr_get_blog_fast.py", line 37
    f'[{blog_name}] {tag_method.upper()}ing posts with tags: {", ".join(tags)}',
                                                                              ^
SyntaxError: invalid syntax

python --version
Python 2.7.18

varadins commented 2 years ago

@6fe9d454 In fact, the error happens even when no arguments are passed. Is this caused by a different Python version? I'm currently running Python 2.7.18.

varadin@host:~/projects/bdsmlr-scripts-master$ python bdsmlr_get_blog_fast.py
  File "bdsmlr_get_blog_fast.py", line 37
    f'[{blog_name}] {tag_method.upper()}ing posts with tags: {", ".join(tags)}',
                                                                              ^
SyntaxError: invalid syntax

varadins commented 2 years ago

Apparently f-strings weren't added until Python 3.6. Back to the original error, now running Python 3.9.2:

varadin@penguin:~/projects/bdsmlr-scripts-master$ python3 bdsmlr_get_blog_fast.py -u dmm.images@gmail.com -p password https://unparalleled.bdsmlr.com
[unparalleled] No tags set, grabbing all posts
[unparalleled] Logged in ...
Traceback (most recent call last):
  File "/home/varadin/projects/bdsmlr-scripts-master/bdsmlr_get_blog_fast.py", line 199, in <module>
    main(args)
  File "/home/varadin/projects/bdsmlr-scripts-master/bdsmlr_get_blog_fast.py", line 96, in main
    true_end_page = list(map(int, filter(str.isnumeric, page_numbers)))[-1]
IndexError: list index out of range
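
On the f-string point above: a minimal sketch of a version guard, assuming it lives in a small launcher file of its own. The names MIN_VERSION and python_is_supported are hypothetical and not part of the script. A guard placed inside bdsmlr_get_blog_fast.py itself would not help, because Python 2.7 rejects the f-string while parsing the file, before any code runs.

```python
import sys

MIN_VERSION = (3, 6)  # f-strings were added in Python 3.6


def python_is_supported(version_info=sys.version_info):
    """Return True if the interpreter is new enough for f-strings."""
    return tuple(version_info[:2]) >= MIN_VERSION


if not python_is_supported():
    # %-formatting on purpose: this file must still parse on old interpreters.
    sys.exit("Python %d.%d or newer is required" % MIN_VERSION)
```

With this in place, running the launcher under Python 2.7.18 would exit with a readable message instead of a SyntaxError.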

varadins commented 2 years ago

Added some print functions to display data:

    true_end_page = 0
    page = html.fromstring(session.get(f"{url}/archive", timeout=300).text)
    page_numbers = page.xpath('//li[@class="page-item"]/a/text()')
    print("true end page =", true_end_page)
    print("page =", page)
    print("page_numbers =", page_numbers)
    true_end_page = list(map(int, filter(str.isnumeric, page_numbers)))[-1]

Results in:

[http://unparalleled] No tags set, grabbing all posts
[http://unparalleled] Logged in ...
true end page = 0
page = <Element html at 0x7db23a67b540>
page_numbers = []
Traceback (most recent call last):
  File "/home/varadin/projects/bdsmlr-scripts-master/bdsmlr_get_blog_fast.py", line 203, in <module>
    main(args)
  File "/home/varadin/projects/bdsmlr-scripts-master/bdsmlr_get_blog_fast.py", line 99, in main
    true_end_page = list(map(int, filter(str.isnumeric, page_numbers)))[-1]
IndexError: list index out of range

varadins commented 2 years ago

It couldn't find the 'page-item' class because infinite scroll was still enabled for the archive. I had to disable it.

lol didn't know that was a prerequisite.

6fe9d454 commented 2 years ago

Sorry, just saw this. Yes, unfortunately it requires infinite scrolling to be disabled on the archive pages. I forgot about that. I'll add a note to the README.
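
For anyone hitting the same traceback, here is a rough sketch of how the end-page lookup could fail with a clearer message. The helper name find_end_page and the error text are assumptions for illustration, not code from the repository. It takes the list of link texts that the page-item XPath returns.

```python
def find_end_page(page_numbers):
    """Return the last numeric page label, or raise a descriptive error."""
    numeric = [int(label) for label in page_numbers if label.isnumeric()]
    if not numeric:
        # An empty list is what the thread saw with infinite scroll enabled.
        raise RuntimeError(
            "No pagination links found on the archive page. "
            "Is infinite scrolling still enabled for the archive?"
        )
    return numeric[-1]


print(find_end_page(["1", "2", "3", "»"]))
```

Calling find_end_page([]) raises the RuntimeError instead of the bare IndexError, which points the user at the infinite-scroll setting.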