String slices must be integers.

sinkaroid / booru

Python bindings for booru imageboards

https://sinkaroid.github.io/booru

MIT License


String slices must be integers. #9

Closed: rV4nxZ closed this issue 1 year ago

rV4nxZ commented 1 year ago

Describe the bug

    try:
        images = [i["file_url"] for i in data]
    except:
        images = [i["file"]["url"] for i in data]

This raises the error. The snippet above is from your parser.

To Reproduce

Steps to reproduce the behavior: call parse_image(result) on:

    result = await r34.search_image("neko", limit=2)  # test

Expected behavior: return only the URL.

Desktop (please complete the following information):

OS: Arch Linux
Python version: 3.10.10 and 3.11.3

Additional context No.

da-vinci-bot[bot] commented 1 year ago

Hey! @rV4nxZ You'll get a response soon! UwU


sinkaroid commented 1 year ago

I tested on 3.10.11 and everything works fine, until some of my test networks were flagged as spam and r34 started serving a captcha. Even in that case it should throw a JSONDecodeError, not "String slices must be integers".

Could you share some more info, please?
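To illustrate the point about captchas: an HTML page is not valid JSON, so decoding it fails before any indexing happens. Below is a minimal sketch using only the standard library. The helper name parse_booru_response and both sample bodies are hypothetical, not part of booru or real API output.

```python
import json


def parse_booru_response(raw: str):
    """Parse a raw response body, raising a clear error if it is not JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # A captcha or error page is HTML, so decoding fails right away.
        raise ValueError(f"response is not JSON (captcha page?): {exc}") from exc


# Hypothetical sample bodies, not real API output.
html_body = "<html><body>Please complete the captcha</body></html>"
json_body = '[{"file_url": "https://example.invalid/a.png"}]'

parsed = parse_booru_response(json_body)

try:
    parse_booru_response(html_body)
    captcha_detected = False
except ValueError:
    captcha_detected = True

print(parsed, captcha_detected)
```

Wrapping the decode step this way keeps a blocked request distinguishable from a parsing bug in the caller's own code.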

rV4nxZ commented 1 year ago

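Hi, I have good news. I just discovered that this:

import asyncio
import json

from booru import Rule34  # assumed import path for the Rule34 client


async def main():
    r34 = Rule34()
    # search() returns a JSON string, so decode it before indexing
    img = await r34.search(query="sexy", limit=1)
    img = json.loads(img)
    urls = [item['file_url'] for item in img if item['file_url'].startswith('https://api-cdn.rule34.xxx/images/')]
    print(urls)

asyncio.run(main())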

parses the images correctly :D

rV4nxZ commented 1 year ago

It only works without the "gacha" parameter; it works with the "limit" parameter.

sinkaroid commented 1 year ago

When you use the gacha method, the result is always a single object (dict), so your loop is of no use there.
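The single-object shape is one plausible source of the reported error: iterating over a dict yields its string keys, and indexing a string with "file_url" raises Python's TypeError "string indices must be integers". The sketch below reproduces that offline. The payloads and the helper names naive_urls and as_list are made up for illustration; only the file_url field comes from this thread.

```python
import json

# Hypothetical payloads; only the field name file_url comes from the thread.
gacha_raw = '{"file_url": "https://example.invalid/one.png"}'
search_raw = (
    '[{"file_url": "https://example.invalid/a.png"},'
    ' {"file_url": "https://example.invalid/b.png"}]'
)


def naive_urls(data):
    # Works only when data is a list of dicts.
    return [i["file_url"] for i in data]


list_urls = naive_urls(json.loads(search_raw))

try:
    # Iterating a dict yields its keys (strings), so i["file_url"] indexes a str.
    naive_urls(json.loads(gacha_raw))
    error_message = ""
except TypeError as exc:
    error_message = str(exc)


def as_list(data):
    # Wrap a single object so both shapes can share one loop.
    return [data] if isinstance(data, dict) else data


gacha_urls = naive_urls(as_list(json.loads(gacha_raw)))
print(list_urls, error_message, gacha_urls)
```

The same TypeError also appears if the JSON string is never decoded, because iterating over a string yields single characters.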

rV4nxZ commented 1 year ago

Fixed it:

    img = await r34.search(query="sexy", gacha=True)

    # search() returns a JSON string, so decode it first
    data = json.loads(img)

    if isinstance(data, dict):
        # gacha returns a single image object
        urls = [data['file_url']] if data['file_url'].startswith('https://api-cdn.') else []
    else:
        # a normal search returns a list of image objects
        urls = [item['file_url'] for item in data if item['file_url'].startswith('https://api-cdn.')]

    print(urls)

Now I'll try to add it to the parse_image function and use resolve instead of data = json.loads(img), to follow your code.

rV4nxZ commented 1 year ago

Updated function:

def parse_image_r34(raw_object: str) -> list:
    # raw_object is the JSON string returned by search()
    data = json.loads(raw_object)
    if isinstance(data, dict):
        # gacha returns a single image object
        urls = [data['file_url']] if data['file_url'].startswith('https://api-cdn.') else []
    else:
        # a normal search returns a list of image objects
        urls = [item['file_url'] for item in data if item['file_url'].startswith('https://api-cdn.')]
    return urls

output: ['https://api-cdn.rule34.xxx/images/6747/48766c19aac641cae347e3a4180e817a.png']

sinkaroid commented 1 year ago

You can always submit a PR if the current flow is a bit clunky on your side.

sinkaroid commented 1 year ago


What is the point of rebuilding urls with a loop when img is only a single object? It would still make sense if you were re-validating the URLs in a normal search, but not with gacha.
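One way to read this suggestion is to keep the two paths separate: read the field directly for a gacha result, and filter in a loop only for a normal search. A rough sketch under that assumption follows. The names gacha_url, search_urls and CDN_PREFIX are hypothetical and not part of booru, and the sample payloads are invented.

```python
import json
from typing import List, Optional

CDN_PREFIX = "https://api-cdn."  # prefix used in the thread's examples


def gacha_url(raw: str) -> Optional[str]:
    # Gacha result: one JSON object, so read the field directly, no loop.
    url = json.loads(raw).get("file_url", "")
    return url if url.startswith(CDN_PREFIX) else None


def search_urls(raw: str) -> List[str]:
    # Normal search: a JSON list, so validating each entry is meaningful.
    return [
        item["file_url"]
        for item in json.loads(raw)
        if item.get("file_url", "").startswith(CDN_PREFIX)
    ]


# Hypothetical payloads for an offline check.
one = '{"file_url": "https://api-cdn.example.invalid/one.png"}'
many = (
    '[{"file_url": "https://api-cdn.example.invalid/a.png"},'
    ' {"file_url": "https://other.example.invalid/b.png"}]'
)

single = gacha_url(one)
filtered = search_urls(many)
print(single, filtered)
```

Returning a single optional value for gacha makes the "one object" contract explicit, while the list version keeps the re-validation step for ordinary searches.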

rV4nxZ commented 1 year ago

Well, it works for what I need.