venthur / immoscrapy

Scrape Immobilienscout24 data using Python
MIT License

Getting response 401 #8

Closed Lama09 closed 2 years ago

Lama09 commented 2 years ago

Hi Bastian, many thanks for providing this package. Unfortunately, as soon as I run a different query than this one

ult = immoscrapy.query('de','berlin', 'berlin', 'HOUSE_BUY',price=900000)

like for example this one here

ult = immoscrapy.query('de','chemnitz', 'chemnitz', 'HOUSE_BUY',price=900000)

I receive a 401 response from the post() request. Do I need any authorization to use this API?

BR Lama

venthur commented 2 years ago

Hi, thanks for the report. Unfortunately, I cannot reproduce this bug. You don't need any authentication for this library to work. I've added a regression test for the case you provided to ensure this example works. I'll leave the report open in case someone can provide more info.

Lama09 commented 2 years ago

Hi Bastian, I downloaded your latest version, 1.0.0, and ran your test function _test_regression_gh8(), unfortunately with the same problem. This was the output:

Connected to pydev debugger (build 201.8743.20)
2022-04-30 22:31:06,271 INFO immoscrapy.immoscrapy Using URL: https://www.immobilienscout24.de/Suche/de/chemnitz/chemnitz/haus-kaufen?sorting=2&price=900000
2022-04-30 22:31:06,281 DEBUG urllib3.connectionpool Starting new HTTPS connection (1): www.immobilienscout24.de:443
2022-04-30 22:31:06,670 DEBUG urllib3.connectionpool https://www.immobilienscout24.de:443 "POST /Suche/de/chemnitz/chemnitz/haus-kaufen?sorting=2&price=900000 HTTP/1.1" 410 None
2022-04-30 22:31:06,685 WARNING immoscrapy.immoscrapy Search returned 0 results.

Running your command-line tool, I got the following error:

immoscrapy buy-house --country de --region chemnitz --price 100000-800000 --numberofrooms 5-

Namespace(city=None, command='buy-house', constructionyear=None, country='de', func=<function buy_house at 0x000001DCD6197790>, livingspace=None, numberofrooms='5-', price='100000-800000', region='chemnitz')
2022-04-30 22:35:00,686 INFO immoscrapy.immoscrapy Using URL: https://www.immobilienscout24.de/Suche/de/chemnitz/haus-kaufen?sorting=2&price=100000-800000&numberofrooms=5-
2022-04-30 22:35:00,689 DEBUG urllib3.connectionpool Starting new HTTPS connection (1): www.immobilienscout24.de:443
2022-04-30 22:35:00,951 DEBUG urllib3.connectionpool https://www.immobilienscout24.de:443 "POST /Suche/de/chemnitz/haus-kaufen?sorting=2&price=100000-800000&numberofrooms=5- HTTP/1.1" 410 None
2022-04-30 22:35:00,978 WARNING immoscrapy.immoscrapy Search returned 0 results.
Traceback (most recent call last):
  File "c:\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\anaconda3\Scripts\immoscrapy.exe__main__.py", line 7, in
  File "c:\anaconda3\lib\site-packages\immoscrapy\cli.py", line 18, in main
    args.func(args)
  File "c:\anaconda3\lib\site-packages\immoscrapy\cli.py", line 128, in buy_house
    pretty_print(results)
  File "c:\anaconda3\lib\site-packages\immoscrapy\cli.py", line 137, in pretty_print
    df.drop('id', axis=1, inplace=True)
  File "c:\anaconda3\lib\site-packages\pandas\core\frame.py", line 4308, in drop
    return super().drop(
  File "c:\anaconda3\lib\site-packages\pandas\core\generic.py", line 4153, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "c:\anaconda3\lib\site-packages\pandas\core\generic.py", line 4188, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "c:\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 5591, in drop
    raise KeyError(f"{labels[mask]} not found in axis")
KeyError: "['id'] not found in axis"

Please let me know if I can provide more information or help to debug this issue. I'm still surprised that your POST request works without any authorization, because the immoscout24 API Developer Portal says that you need to apply for an access key to use this REST API.

venthur commented 2 years ago

I think I found the issue, thanks for the extra information. The problem was pretty-printing empty results: with zero results the DataFrame has no 'id' column, so the drop call raised the KeyError.
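For readers hitting the same traceback, here is a minimal sketch of how an empty result list can be guarded before the 'id' column is dropped. This is not the fix that was committed to immoscrapy; the function name format_results and the sample row are hypothetical, and only the pandas calls themselves are real.

```python
import pandas as pd


def format_results(results):
    """Return a printable table for a list of result dicts.

    Hypothetical helper for illustration; not part of immoscrapy.
    """
    # An empty list yields a DataFrame without columns, so dropping
    # 'id' would raise KeyError. Handle that case up front.
    if not results:
        return "No results found."

    df = pd.DataFrame(results)
    # errors="ignore" skips the drop if the column is absent.
    df = df.drop(columns=["id"], errors="ignore")
    return df.to_string(index=False)


if __name__ == "__main__":
    print(format_results([]))
    # Made-up sample row, only to show the non-empty path.
    print(format_results([{"id": 1, "title": "Haus", "price": 500000}]))
```

Either guard alone (the early return or errors="ignore") would avoid the crash; using both also gives the user a readable message instead of an empty table.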

Lama09 commented 2 years ago

And I think I found the issue on my side. The crash in your CLI tool came from the empty search results, and the empty search results were due to a wrong search string on my side. The region is sachsen, not chemnitz. With the following request, the query returns results.

ult = immoscrapy.query('de','sachsen', 'chemnitz', 'HOUSE_BUY',price=900000)

I discovered this mistake when I copied the complete URL generated within immoscrapy.query into my web browser, where I then also received an error response. This also gave me the idea of providing a query function where the user can pass the immoscout24 search string directly. That way you could use the immoscout frontend to set up all your search criteria and then use the resulting URL in your function. Maybe a feature request?
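To make the idea concrete, here is a rough sketch of how a search URL could be split into its parts using only the standard library. The function parse_search_url is hypothetical and not part of immoscrapy, and the assumed path layout (/Suche/country/region/city/type) is inferred only from the URLs in the log output above.

```python
from urllib.parse import parse_qsl, urlsplit


def parse_search_url(url):
    """Split an Immobilienscout24 search URL into its components.

    Hypothetical helper; assumes the path looks like
    /Suche/<country>[/<region>[/<city>]]/<type>.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 3 or segments[0] != "Suche":
        raise ValueError(f"Not a recognised search URL: {url}")

    # The last segment is the real estate type, the rest is the location.
    *location, real_estate_type = segments[1:]
    return {
        "country": location[0],
        "region": location[1] if len(location) > 1 else None,
        "city": location[2] if len(location) > 2 else None,
        "type": real_estate_type,
        # Query parameters such as price or numberofrooms stay as strings.
        "filters": dict(parse_qsl(parts.query)),
    }


if __name__ == "__main__":
    print(parse_search_url(
        "https://www.immobilienscout24.de/Suche/de/chemnitz/chemnitz/"
        "haus-kaufen?sorting=2&price=900000"
    ))
```

Mapping the parsed type (for example haus-kaufen) back to a constant such as HOUSE_BUY would still be needed before calling immoscrapy.query; that mapping is left out here because it is not documented in this thread.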