bryanyang0528 / ksql-python

A python wrapper for the KSQL REST API.
MIT License

KSQL Query function is timing out but working fine in cli and curl #59

Open mungujn opened 5 years ago

mungujn commented 5 years ago

When I run a KSQL query with the ksql-cli or with curl, it works fine. Making the same query with this library doesn't work: it times out. After investigating, it looks like the problem is how the Python requests library handles streamed results: https://stackoverflow.com/a/28156068/4395533. This quote from the Stack Overflow user larsks, who answered the question, summarises the problem:

This behaviour is due to a buggy implementation of the iter_lines method in the requests library.

iter_lines iterates over the response content in chunk_size blocks of data using the iter_content iterator. If there are less than chunk_size bytes of data available for reading from the remote server (which will typically be the case when reading the last line of output), the read operation will block until chunk_size bytes of data are available.

I made the same request in Go using the net/http package and the behaviour was the same, so one could argue that the clients making the requests are implemented correctly and that it is KSQL that returns results in a sub-optimal way.

Of course, the query works fine in curl, and the workaround we are using is to call curl through Python's subprocess module and pipe the data back to our application. I guess that is a case for the issue being with how this library handles responses.
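For readers who want to try the curl workaround described above, here is a minimal sketch, not the code actually used in this thread. The helper names `build_curl_command` and `stream_lines`, the base URL, and the assumption that the server writes one result per line are all illustrative; the `/query` path, the JSON body shape, and the content type follow the KSQL REST API.

```python
import json
import subprocess

# Content type used by the KSQL REST API.
KSQL_CONTENT_TYPE = "application/vnd.ksql.v1+json; charset=utf-8"


def build_curl_command(base_url, ksql):
    """Build a curl invocation that POSTs a query to the /query endpoint.

    Hypothetical helper: the name and argument layout are illustrative.
    """
    payload = json.dumps({"ksql": ksql, "streamsProperties": {}})
    return [
        "curl", "--silent", "--no-buffer",  # --no-buffer: emit output as it arrives
        "-X", "POST", base_url.rstrip("/") + "/query",
        "-H", "Content-Type: " + KSQL_CONTENT_TYPE,
        "-d", payload,
    ]


def stream_lines(command):
    """Run a command and yield its non-empty stdout lines as they arrive."""
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        try:
            for line in proc.stdout:
                line = line.strip()
                if line:  # skip blank keep-alive lines
                    yield line
        finally:
            # A query without LIMIT never ends, so stop curl explicitly
            # when the consumer stops iterating.
            proc.terminate()


if __name__ == "__main__":
    # Assumed server address; adjust to your deployment.
    cmd = build_curl_command("http://localhost:8088", "SELECT * FROM USERS_TABLE;")
    for row in stream_lines(cmd):
        print(row)
```

Because the generator terminates curl in its `finally` block, the caller can break out of the loop at any point without leaving the child process running.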

bryanyang0528 commented 5 years ago

@mungujn Thank you for reporting this problem. I will figure out how I can fix it. Can you give me some examples of the queries you used?

BenMacKenzie commented 5 years ago

I have the same problem. I was experimenting with the idea of training a model using data from a Kafka topic. I wrote the data from the Kaggle Titanic data set into a Kafka topic and created a stream from the Confluent KSQL editor. I then query the stream. It works fine from the CLI but does not work from KSQLAPI unless I add a 'limit XX' to the query.

mungujn commented 5 years ago

@mungujn Thank you for reporting this problem. I will figure out how I can fix it. Can you give me some example of queries you used?

Every query I ran exhibited that behaviour, e.g.:

SELECT * FROM USERS_TABLE;

Although I did not try adding a limit xx.

olmul commented 5 years ago

@bryanyang0528 I was experiencing this issue too. It seems that the requests library doesn't stream data back when using the POST method. I found that using urllib.request works as a workaround, reading the response line by line. I did a crude swap; it probably needs to be cleaned up.
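The swap itself was not posted in this thread, so the following is only a rough sketch of what a standard-library version could look like. The function names `build_query_request`, `iter_rows`, and `stream_query` are hypothetical, the server address is assumed, and the sketch assumes the server sends one result per line with blank lines in between.

```python
import json
import urllib.request

# Content type used by the KSQL REST API.
KSQL_CONTENT_TYPE = "application/vnd.ksql.v1+json; charset=utf-8"


def build_query_request(base_url, ksql, streams_properties=None):
    """Build a POST request for the /query endpoint (hypothetical helper)."""
    body = json.dumps(
        {"ksql": ksql, "streamsProperties": streams_properties or {}}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/query",
        data=body,
        headers={"Content-Type": KSQL_CONTENT_TYPE},
        method="POST",
    )


def iter_rows(response):
    """Yield non-empty lines from a binary file-like response as text."""
    for raw in response:  # iterating reads one line at a time
        line = raw.strip()
        if line:  # skip blank keep-alive lines
            yield line.decode("utf-8")


def stream_query(base_url, ksql):
    """Open the query and yield each result line as soon as it is read."""
    request = build_query_request(base_url, ksql)
    with urllib.request.urlopen(request) as response:
        yield from iter_rows(response)


if __name__ == "__main__":
    # Assumed server address; adjust to your deployment.
    for row in stream_query("http://localhost:8088", "SELECT * FROM USERS_TABLE;"):
        print(row)
```

Keeping `iter_rows` separate from the network call means the line handling can be exercised against any file-like object, such as `io.BytesIO`, without a running server.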

bryanyang0528 commented 5 years ago

@olmul would you mind creating a PR and sharing your workaround?

romainr commented 5 years ago

Experiencing the same (blocking) with any query without 'LIMIT' on the 'query()' API. Will try the PR.

jacobcrawford commented 2 years ago

Has this problem been fixed? I seem to experience the same problem when using the library.

KenCox94 commented 2 years ago

@jacobcrawford this may be resolved with an upcoming release. Please advise after the update.