CartoDB / carto-python

CARTO Python client
https://carto.com
BSD 3-Clause "New" or "Revised" License

Unable to cc.read or cc.query #89

Closed MichaelSpichiger closed 6 years ago

MichaelSpichiger commented 6 years ago

Hello! I've been trying to do cc.read and cc.query using cartoframes. When I try and run it I get this error message:

<ipython-input-54-d98adbbbcfad> in <module>()
----> 1 assigned = cc.read('us_counties_assigned')
      2 unassigned = cc.read('us_counties_unassigned')

/Users/mspichiger/venv/lib/python2.7/site-packages/cartoframes/context.pyc in read(self, table_name, limit, index, decode_geom, shared_user)
    192                 raise ValueError("`limit` parameter must an integer >= 0")
    193 
--> 194         return self.query(query, decode_geom=decode_geom)
    195 
    196     def write(self, df, table_name, temp_dir=CACHE_DIR, overwrite=False,

/Users/mspichiger/venv/lib/python2.7/site-packages/cartoframes/context.pyc in query(self, query, table_name, decode_geom)
    728                 query,
    729                 skipfields='the_geom_webmercator',
--> 730                 **DEFAULT_SQL_ARGS)
    731             if 'error' in select_res:
    732                 raise CartoException(str(select_res['error']))

/Users/mspichiger/venv/lib/python2.7/site-packages/carto/sql.pyc in send(self, sql, parse_json, do_post, format, **request_args)
     84             return self.auth_client.get_response_data(resp, parse_json)
     85         except Exception as e:
---> 86             raise CartoException(e)
     87 
     88 

CartoException: Unterminated string starting at: line 1 column 1349219 (char 1349218)

The confusing part is that every time I run the command I get a new column value. In the example I gave the number is 1349219, but I also get 2981871 (char 2981870), 1266002 (char 1266001), and many more seemingly random columns. The dataset I'm trying to read is about 6.6 MB and has 18 columns. I've included the table here:

us_counties_assigned.csv.zip

Let me know if there's anything I can do to help.

MichaelSpichiger commented 6 years ago

Additionally, resetting the index, deleting the cartodb_id column, and writing the dataframe again appears to make it readable.

michellemho commented 6 years ago

I am also running into this issue, and it's blocking my ability to complete analysis and share my jupyter notebook in https://github.com/CartoDB/research/issues/508. What's odd is that just a month ago, these queries were working. Something has changed since then, but we're not sure what. Something with limits?

Related issue: https://github.com/CartoDB/carto-python/issues/88

As @andy-esch pointed out, the carto python sdk equivalent is:

sql_client = SQLClient(...)
sql_client.send('select * from table')

michellemho commented 6 years ago

@danicarrion, any ideas on what could be causing this issue? We're finding that it's intermittent. Sometimes the request query will work, sometimes it won't.

orrholly commented 6 years ago

@danicarrion - I'm running into this issue as well.

andy-esch commented 6 years ago

@alrocar this is even showing up on reliable CI tests on cartoframes where previously this didn't happen at all: https://travis-ci.org/CartoDB/cartoframes/jobs/415581494#L727

zingbot commented 6 years ago

@ramiroaznar Is there an update on this issue? This is blocking a ton of work from being done at a crunch time here in the US. We have demos and sales opportunities being compromised by this blocker.

ramiroaznar commented 6 years ago

Adding this to the RT kanban (FYI @alrocar and @danicarrion will be back next week).

stuartlynn commented 6 years ago

Hey guys.

I think this is an issue not with cartoframes or carto-python but with the SQL API. It's intermittent, which makes me think that it's an issue in the caching layer, and it seems to have started without any changes to either cartoframes or carto-python.

pramsey commented 6 years ago

The intermittent nature makes me suspect a time-based limit. For data sets of the right size, sometimes the work gets done before the axe falls, and sometimes they go over the limit and get squished.
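To illustrate why a response that is cut short could produce this exact error with a different column each time, here is a minimal sketch using only the standard library. It is an illustration of the hypothesis, not a confirmed diagnosis: the payload and the helper `parse_truncated` are made up for the example and are not part of cartoframes or carto-python.

```python
import json


def parse_truncated(payload, cutoff):
    """Parse only the first `cutoff` characters of `payload`.

    Returns (data, None) on success or (None, error_message) on failure.
    """
    try:
        return json.loads(payload[:cutoff]), None
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        return None, str(err)


# Hypothetical stand-in for a SQL API response body.
payload = json.dumps({"rows": [{"name": "county-%d" % i} for i in range(1000)]})

# Cut the body in the middle of two different string values.
early_cut = payload.index("county-100") + 3
late_cut = payload.index("county-900") + 3

_, err_early = parse_truncated(payload, early_cut)
_, err_late = parse_truncated(payload, late_cut)
_, err_full = parse_truncated(payload, len(payload))

print(err_early)  # "Unterminated string starting at: ..." with one position
print(err_late)   # same message, different position
print(err_full)   # None: the complete body parses fine
```

If the body were being cut off at a varying point, the reported column would simply be wherever the last unfinished string began, which would match the seemingly random numbers in the original report.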

stuartlynn commented 6 years ago

Have the limits changed then? Or how we deal with timeouts? This seems to have been something that only started a few weeks ago.

ramiroaznar commented 6 years ago

Yes, they have changed but timeout errors have been set for a while... let's see what @oleurud could find out...

simon-contreras-deel commented 6 years ago

I will need more info to give you the best help.

Based on this comment of yours, I think you are receiving a SQL timeout error (https://carto.com/developers/sql-api/support/timeout-limiting):

In the example I gave the number is 1349219, but I also get 2981871 (char 2981870), 1266002 (char 1266001), and many more seemingly random columns

But the error in the response makes no sense to me. The error

`'limit' parameter must an integer >= 0`

should be

You are over platform\'s limits: SQL query timeout error. Refactor your query before running again or contact CARTO support for more details.

So, in order to replicate the problematic request, I need to know the user (with access to the table us_counties_unassigned). I suppose it is in production.

pramsey commented 6 years ago

@rafatower the timing of this issue starting up is quite similar to the timing of your Python API work, though I don't see an obvious intersection between what you did and the path being exercised by cartoframes.

rafatower commented 6 years ago

This issue predates any development related to copy.

pramsey commented 6 years ago

How about changes to the varnish configuration in support of same?

simon-contreras-deel commented 6 years ago

Hey @MichaelSpichiger, can you tell me which user this is? Or @michellemho, can you give me more data about your case (user and table)?

michellemho commented 6 years ago

@oleurud, @MichaelSpichiger was our intern and his last day was yesterday.

Basically any large table is failing the query. In my account, the table lion_lines_full (this is a data set of streets in NYC) tends to fail. My user is michellemho-carto. Let me know if you need more information.

MichaelSpichiger commented 6 years ago

Hello, my user is mspichigercarto. The table target_1 is failing and it's not even that large: it's 15 rows by 12 columns. I also notice that other tables that are larger work fine, and SOMETIMES resetting the index and dropping the cartodb_id works. It's not consistent though.

Hope this helps

simon-contreras-deel commented 6 years ago

I think it is the same as https://github.com/CartoDB/support/issues/1703 and https://github.com/CartoDB/support/issues/1688

andy-esch commented 6 years ago

@oleurud, about the 'limit' parameter must an integer >= 0 error, that's just the line in the code that the stacktrace carries over from the source code, not the actual error. Python stack traces give you a couple of lines before where the error was thrown to give context. In this case, there is a try/except block before the return, but that block is not the source of the error.

The path is: cc.read -> cc.query -> carto python sdk sql client -> CartoException
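The path above can be sketched with simplified stand-ins. This is not the real cartoframes or carto-python code: the `CartoException` class and the `read`, `query`, and `send` functions below are hypothetical, and they take a raw response body directly instead of a table name or SQL string. Only the `except Exception as e: raise CartoException(e)` shape is taken from the traceback in the original report.

```python
import json


class CartoException(Exception):
    """Stand-in for the SDK's exception type (hypothetical, for illustration)."""


def send(raw_body):
    # Same try/except shape as in the traceback: any failure while handling
    # the response is re-raised as CartoException.
    try:
        return json.loads(raw_body)
    except Exception as e:
        raise CartoException(e)


def query(raw_body):
    # Stand-in for cc.query, which delegates to the SQL client.
    return send(raw_body)


def read(raw_body):
    # Stand-in for cc.read, which delegates to cc.query.
    return query(raw_body)


try:
    read('{"rows": [{"name": "Kings Cou')  # deliberately incomplete JSON
    message = None
except CartoException as exc:
    message = str(exc)

print(message)
```

The printed message is the JSON parser's own text, which is why the `limit` check shown in the traceback context is unrelated to the failure.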

simon-contreras-deel commented 6 years ago

Yes, I realized later! Thank you

ramiroaznar commented 6 years ago

Assigning to Infra, but the work will be done on the other related issues.

andy-esch commented 6 years ago

This should be fixed now: https://github.com/CartoDB/support/issues/1688