CartoDB / cartoframes

CARTO Python package for data scientists
BSD 3-Clause "New" or "Revised" License
251 stars 63 forks source link

cc.data 'HTTPError: 414 Client Error: Request-URI Too Large for url' #778

Closed djfan closed 5 years ago

djfan commented 5 years ago

Running the following codes to do data enrichment

DO_features = [{"numer_id": "us.census.acs.B01003001", "normalization": "predenominated"},
               {"numer_id": "us.census.acs.B01001002", "normalization": "predenominated"}] 
dj.data('dis_qt', DO_features, persist_as='qt_enrich_pop2')
Error Message

``` /Users/dfan/src/carto/carto/sql.py:229: UserWarning: Batch SQL job created with job_id: 6d87de1d-5887-4552-a2b9-16111f704ab7 warnings.warn('Batch SQL job created with job_id: {job_id}'.format(job_id=data['job_id'])) --------------------------------------------------------------------------- HTTPError Traceback (most recent call last) ~/src/carto/carto/sql.py in copyto(self, query) 452 stream=True) --> 453 response.raise_for_status() 454 except CartoRateLimitException as e: ~/anaconda3/lib/python3.6/site-packages/requests/models.py in raise_for_status(self) 939 if http_error_msg: --> 940 raise HTTPError(http_error_msg, response=self) 941 HTTPError: 414 Client Error: Request-URI Too Large for url: https://dfan.carto.com/api/v2/sql/copyto?api_key=fe760da2e615ed7b6fb527939757107aefe0dad1&q=COPY+%28SELECT+qt%2C+the_geom%2C+tag%2C+cartodb_id%2C+total_pop_2006_2010%2C+total_pop_2010_2014%2C+total_pop_2011_2015%2C+male_pop_2006_2010%2C+male_pop_2010_2014%2C+male_pop_2011_2015+FROM+%28SELECT+t.qt%2Ct.the_geom%2Ct.tag%2Ct.cartodb_id%2C+%28data-%3E0-%3E%3E%27value%27%29%3A%3ANumeric+AS+total_pop_2006_2010%2C+%28data-%3E1-%3E%3E%27value%27%29%3A%3ANumeric+AS+total_pop_2010_2014%2C+%28data-%3E2-%3E%3E%27value%27%29%3A%3ANumeric+AS+total_pop_2011_2015%2C+%28data-%3E3-%3E%3E%27value%27%29%3A%3ANumeric+AS+male_pop_2006_2010%2C+%28data-%3E4-%3E%3E%27value%27%29%3A%3ANumeric+AS+male_pop_2010_2014%2C+%28data-%3E5-%3E%3E%27value%27%29%3A%3ANumeric+AS+male_pop_2011_2015%0AFROM+OBS_GetData%28%0A%28SELECT+array_agg%28%28the_geom%2C+cartodb_id%29%3A%3Ageomval%29%0AFROM+%22dis_qt%22%29%2C%0A%28SELECT+%27%5B%7B%22denom_aggregate%22%3Anull%2C%22denom_colname%22%3Anull%2C%22denom_geomref_colname%22%3Anull%2C%22denom_id%22%3Anull%2C%22denom_name%22%3Anull%2C%22denom_reltype%22%3Anull%2C%22denom_tablename%22%3Anull%2C%22denom_type%22%3Anull%2C%22geom_colname%22%3A%22the_geom%22%2C%22geom_geomref_colname%22%3A%22geoid%22%2C%22geom_id%22%3A%22us.census.tiger.block_group%22%2C%22geom_name%22%3A%22US+Census+Block+Groups%22%2C%22geom_tablename%22%3A%22obs_78fb6c1d6ff6505225175922c2c389ce48d7632c%22%2C%22geom_timespan%22%3A%222015%22%2C%22geom_type%22%3A%22Geometry%22%2C%22id%22%3A1.0%2C%22max_score_rank%22%3Anull%2C%22max_timespan_rank%22%3Anull%2C%22normalization%22%3A%22predenominated%22%2C%22num_geoms%22%3A214156.8157292008%2C%22numer_aggregate%22%3A%22sum%22%2C%22numer_colname%22%3A%22total_pop%22%2C%22numer_geomref_colname%22%3A%22geoidsl%22%2C%22numer_id%22%3A%22us.census.acs.B01003001%22%2C%22numer_name%22%3A%22Total+Population%22%2C%22numer_tablename%22%3A%22obs_904e770b77c9b3dc25ab96b0c47405c757617d86%22%2C%22numer_timespan%22%3A%222006+-+2010%22%2C%22numer_type%22%3A%22Numeric%22%2C%22score%22%3A30.0859785631%2C%22score_rank%22%3A1.0%2C%22score_rownum%22%3A1.0%2C%22suggested_name%22%3A%22total_pop_2006_2010%22%2C%22target_area%22%3Anull%2C%22target_geoms%22%3Anull%2C%22timespan_rank%22%3A6.0%2C%22timespan_rownum%22%3A3.0%7D%2C%7B%22denom_aggregate%22%3Anull%2C%22denom_colname%22%3Anull%2C%22denom_geomref_colname%22%3Anull%2C%22denom_id%22%3Anull%2C%22denom_name%22%3Anull%2C%22denom_reltype%22%3Anull%2C%22denom_tablename%22%3Anull%2C%22denom_type%22%3Anull%2C%22geom_colname%22%3A%22the_geom%22%2C%22geom_geomref_colname%22%3A%22geoid%22%2C%22geom_id%22%3A%22us.census.tiger.block_group%22%2C%22geom_name%22%3A%22US+Census+Block+Groups%22%2C%22geom_tablename%22%3A%22obs_78fb6c1d6ff6505225175922c2c389ce48d7632c%22%2C%22geom_timespan%22%3A%222015%22%2C%22geom_type%22%3A%22Geometry%22%2C%22id%22%3A1.0%2C%22max_score_rank%22%3Anull%2C%22max_timespan_rank%22%3Anull%2C%22normalization%22%3A%22predenominated%22%2C%22num_geoms%22%3A214156.8157292008%2C%22numer_aggregate%22%3A%22sum%22%2C%22numer_colname%22%3A%22total_pop%22%2C%22numer_geomref_colname%22%3A%22geoidsl%22%2C%22numer_id%22%3A%22us.census.acs.B01003001%22%2C%22numer_name%22%3A%22Total+Population%22%2C%22numer_tablename%22%3A%22obs_1a098da56badf5f32e336002b0a81708c40d29cd%22%2C%22numer_timespan%22%3A%222010+-+2014%22%2C%22numer_type%22%3A%22Numeric%22%2C%22score%22%3A30.0859785631%2C%22score_rank%22%3A1.0%2C%22score_rownum%22%3A1.0%2C%22suggested_name%22%3A%22total_pop_2010_2014%22%2C%22target_area%22%3Anull%2C%22target_geoms%22%3Anull%2C%22timespan_rank%22%3A4.0%2C%22timespan_rownum%22%3A2.0%7D%2C%7B%22denom_aggregate%22%3Anull%2C%22denom_colname%22%3Anull%2C%22denom_geomref_colname%22%3Anull%2C%22denom_id%22%3Anull%2C%22denom_name%22%3Anull%2C%22denom_reltype%22%3Anull%2C%22denom_tablename%22%3Anull%2C%22denom_type%22%3Anull%2C%22geom_colname%22%3A%22the_geom%22%2C%22geom_geomref_colname%22%3A%22geoid%22%2C%22geom_id%22%3A%22us.census.tiger.block_group%22%2C%22geom_name%22%3A%22US+Census+Block+Groups%22%2C%22geom_tablename%22%3A%22obs_78fb6c1d6ff6505225175922c2c389ce48d7632c%22%2C%22geom_timespan%22%3A%222015%22%2C%22geom_type%22%3A%22Geometry%22%2C%22id%22%3A1.0%2C%22max_score_rank%22%3Anull%2C%22max_timespan_rank%22%3Anull%2C%22normalization%22%3A%22predenominated%22%2C%22num_geoms%22%3A214156.8157292008%2C%22numer_aggregate%22%3A%22sum%22%2C%22numer_colname%22%3A%22total_pop%22%2C%22numer_geomref_colname%22%3A%22geoidsl%22%2C%22numer_id%22%3A%22us.census.acs.B01003001%22%2C%22numer_name%22%3A%22Total+Population%22%2C%22numer_tablename%22%3A%22obs_209d3476ef8eaaa18e597cabcf1bdb627f37aa5e%22%2C%22numer_timespan%22%3A%222011+-+2015%22%2C%22numer_type%22%3A%22Numeric%22%2C%22score%22%3A30.0859785631%2C%22score_rank%22%3A1.0%2C%22score_rownum%22%3A1.0%2C%22suggested_name%22%3A%22total_pop_2011_2015%22%2C%22target_area%22%3Anull%2C%22target_geoms%22%3Anull%2C%22timespan_rank%22%3A3.0%2C%22timespan_rownum%22%3A1.0%7D%2C%7B%22denom_aggregate%22%3A%22sum%22%2C%22denom_colname%22%3A%22total_pop%22%2C%22denom_geomref_colname%22%3A%22geoidsl%22%2C%22denom_id%22%3A%22us.census.acs.B01003001%22%2C%22denom_name%22%3A%22Total+Population%22%2C%22denom_reltype%22%3A%22denominator%22%2C%22denom_tablename%22%3A%22obs_904e770b77c9b3dc25ab96b0c47405c757617d86%22%2C%22denom_type%22%3A%22Numeric%22%2C%22geom_colname%22%3A%22the_geom%22%2C%22geom_geomref_colname%22%3A%22geoid%22%2C%22geom_id%22%3A%22us.census.tiger.block_group%22%2C%22geom_name%22%3A%22US+Census+Block+Groups%22%2C%22geom_tablename%22%3A%22obs_78fb6c1d6ff6505225175922c2c389ce48d7632c%22%2C%22geom_timespan%22%3A%222015%22%2C%22geom_type%22%3A%22Geometry%22%2C%22id%22%3A2.0%2C%22max_score_rank%22%3Anull%2C%22max_timespan_rank%22%3Anull%2C%22normalization%22%3A%22predenominated%22%2C%22num_geoms%22%3A214156.8157292008%2C%22numer_aggregate%22%3A%22sum%22%2C%22numer_colname%22%3A%22male_pop%22%2C%22numer_geomref_colname%22%3A%22geoidsl%22%2C%22numer_id%22%3A%22us.census.acs.B01001002%22%2C%22numer_name%22%3A%22Male+Population%22%2C%22numer_tablename%22%3A%22obs_904e770b77c9b3dc25ab96b0c47405c757617d86%22%2C%22numer_timespan%22%3A%222006+-+2010%22%2C%22numer_type%22%3A%22Numeric%22%2C%22score%22%3A30.0859785631%2C%22score_rank%22%3A1.0%2C%22score_rownum%22%3A1.0%2C%22suggested_name%22%3A%22male_pop_2006_2010%22%2C%22target_area%22%3Anull%2C%22target_geoms%22%3Anull%2C%22timespan_rank%22%3A6.0%2C%22timespan_rownum%22%3A3.0%7D%2C%7B%22denom_aggregate%22%3A%22sum%22%2C%22denom_colname%22%3A%22total_pop%22%2C%22denom_geomref_colname%22%3A%22geoidsl%22%2C%22denom_id%22%3A%22us.census.acs.B01003001%22%2C%22denom_name%22%3A%22Total+Population%22%2C%22denom_reltype%22%3A%22denominator%22%2C%22denom_tablename%22%3A%22obs_1a098da56badf5f32e336002b0a81708c40d29cd%22%2C%22denom_type%22%3A%22Numeric%22%2C%22geom_colname%22%3A%22the_geom%22%2C%22geom_geomref_colname%22%3A%22geoid%22%2C%22geom_id%22%3A%22us.census.tiger.block_group%22%2C%22geom_name%22%3A%22US+Census+Block+Groups%22%2C%22geom_tablename%22%3A%22obs_78fb6c1d6ff6505225175922c2c389ce48d7632c%22%2C%22geom_timespan%22%3A%222015%22%2C%22geom_type%22%3A%22Geometry%22%2C%22id%22%3A2.0%2C%22max_score_rank%22%3Anull%2C%22max_timespan_rank%22%3Anull%2C%22normalization%22%3A%22predenominated%22%2C%22num_geoms%22%3A214156.8157292008%2C%22numer_aggregate%22%3A%22sum%22%2C%22numer_colname%22%3A%22male_pop%22%2C%22numer_geomref_colname%22%3A%22geoidsl%22%2C%22numer_id%22%3A%22us.census.acs.B01001002%22%2C%22numer_name%22%3A%22Male+Population%22%2C%22numer_tablename%22%3A%22obs_1a098da56badf5f32e336002b0a81708c40d29cd%22%2C%22numer_timespan%22%3A%222010+-+2014%22%2C%22numer_type%22%3A%22Numeric%22%2C%22score%22%3A30.0859785631%2C%22score_rank%22%3A1.0%2C%22score_rownum%22%3A1.0%2C%22suggested_name%22%3A%22male_pop_2010_2014%22%2C%22target_area%22%3Anull%2C%22target_geoms%22%3Anull%2C%22timespan_rank%22%3A4.0%2C%22timespan_rownum%22%3A2.0%7D%2C%7B%22denom_aggregate%22%3A%22sum%22%2C%22denom_colname%22%3A%22total_pop%22%2C%22denom_geomref_colname%22%3A%22geoidsl%22%2C%22denom_id%22%3A%22us.census.acs.B01003001%22%2C%22denom_name%22%3A%22Total+Population%22%2C%22denom_reltype%22%3A%22denominator%22%2C%22denom_tablename%22%3A%22obs_209d3476ef8eaaa18e597cabcf1bdb627f37aa5e%22%2C%22denom_type%22%3A%22Numeric%22%2C%22geom_colname%22%3A%22the_geom%22%2C%22geom_geomref_colname%22%3A%22geoid%22%2C%22geom_id%22%3A%22us.census.tiger.block_group%22%2C%22geom_name%22%3A%22US+Census+Block+Groups%22%2C%22geom_tablename%22%3A%22obs_78fb6c1d6ff6505225175922c2c389ce48d7632c%22%2C%22geom_timespan%22%3A%222015%22%2C%22geom_type%22%3A%22Geometry%22%2C%22id%22%3A2.0%2C%22max_score_rank%22%3Anull%2C%22max_timespan_rank%22%3Anull%2C%22normalization%22%3A%22predenominated%22%2C%22num_geoms%22%3A214156.8157292008%2C%22numer_aggregate%22%3A%22sum%22%2C%22numer_colname%22%3A%22male_pop%22%2C%22numer_geomref_colname%22%3A%22geoidsl%22%2C%22numer_id%22%3A%22us.census.acs.B01001002%22%2C%22numer_name%22%3A%22Male+Population%22%2C%22numer_tablename%22%3A%22obs_209d3476ef8eaaa18e597cabcf1bdb627f37aa5e%22%2C%22numer_timespan%22%3A%222011+-+2015%22%2C%22numer_type%22%3A%22Numeric%22%2C%22score%22%3A30.0859785631%2C%22score_rank%22%3A1.0%2C%22score_rownum%22%3A1.0%2C%22suggested_name%22%3A%22male_pop_2011_2015%22%2C%22target_area%22%3Anull%2C%22target_geoms%22%3Anull%2C%22timespan_rank%22%3A3.0%2C%22timespan_rownum%22%3A1.0%7D%5D%27%3A%3Ajson%29%29+as+m%2C%0Adis_qt+as+t%0AWHERE+t.%22cartodb_id%22+%3D+m.id%29+_q%29+TO+stdout+WITH+%28FORMAT+csv%2C+HEADER+true%29&client=cartoframes_1.0.0b1-dev During handling of the above exception, another exception occurred: JSONDecodeError Traceback (most recent call last) in 5 DO_features = [{"numer_id": "us.census.acs.B01003001", "normalization": "predenominated"}, 6 {"numer_id": "us.census.acs.B01001002", "normalization": "predenominated"}] ----> 7 dj.data('dis_qt', DO_features, persist_as='qt_enrich_pop2') ~/src/cartoframes/cartoframes/context.py in data(self, table_name, metadata, persist_as, how) 1639 meta=_meta.to_json(orient='records').replace('\'', '\'\'')) 1640 -> 1641 return self.query(query, table_name=persist_as, decode_geom=False, is_select=True) 1642 1643 def _auth_send(self, relative_path, http_method, **kwargs): ~/src/cartoframes/cartoframes/context.py in query(self, query, table_name, decode_geom, is_select) 699 dataset = Dataset.from_query(query=query, context=self) 700 dataset.upload(table_name=table_name) --> 701 dataframe = dataset.download(decode_geom=decode_geom) 702 else: 703 dataframe = self.fetch(query, decode_geom=decode_geom) ~/src/cartoframes/cartoframes/data/dataset.py in download(self, limit, decode_geom, retry_times) 190 table_columns = self.get_table_columns() 191 query = self._get_read_query(table_columns, limit) --> 192 self._df = self._cc.fetch(query, decode_geom=decode_geom) 193 return self._df 194 ~/src/cartoframes/cartoframes/context.py in fetch(self, query, decode_geom) 515 """ 516 copy_query = 'COPY ({query}) TO stdout WITH (FORMAT csv, HEADER true)'.format(query=query) --> 517 result = recursive_read(self, copy_query) 518 519 query_columns = get_columns(self, query) ~/src/cartoframes/cartoframes/data/utils.py in recursive_read(context, query, retry_times) 136 def recursive_read(context, query, retry_times=DEFAULT_RETRY_TIMES): 137 try: --> 138 return context.copy_client.copyto_stream(query) 139 except CartoRateLimitException as err: 140 if retry_times > 0: ~/src/carto/carto/sql.py in copyto_stream(self, query) 519 :raise: CartoException 520 """ --> 521 return ResponseStream(self.copyto(query)) ~/src/carto/carto/sql.py in copyto(self, query) 457 if 400 <= response.status_code < 500: 458 # Client error, provide better reason --> 459 reason = response.json()['error'][0] 460 error_msg = u'%s Client Error: %s' % (response.status_code, 461 reason) ~/anaconda3/lib/python3.6/site-packages/requests/models.py in json(self, **kwargs) 895 # used. 896 pass --> 897 return complexjson.loads(self.text, **kwargs) 898 899 @property ~/anaconda3/lib/python3.6/site-packages/simplejson/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, use_decimal, **kw) 516 parse_constant is None and object_pairs_hook is None 517 and not use_decimal and not kw): --> 518 return _default_decoder.decode(s) 519 if cls is None: 520 cls = JSONDecoder ~/anaconda3/lib/python3.6/site-packages/simplejson/decoder.py in decode(self, s, _w, _PY3) 368 if _PY3 and isinstance(s, binary_type): 369 s = s.decode(self.encoding) --> 370 obj, end = self.raw_decode(s) 371 end = _w(s, end).end() 372 if end != len(s): ~/anaconda3/lib/python3.6/site-packages/simplejson/decoder.py in raw_decode(self, s, idx, _w, _PY3) 398 elif ord0 == 0xef and s[idx:idx + 3] == '\xef\xbb\xbf': 399 idx += 3 --> 400 return self.scan_once(s, idx=_w(s, idx).end()) JSONDecodeError: Expecting value: line 1 column 1 (char 0) ```

It works if I add two features seperatly.

Jesus89 commented 5 years ago

@oleurud @alrocar

alrocar commented 5 years ago

We recently added POST support to COPY TO queries, that might solve this issue.

andy-esch commented 5 years ago

I had the same issue when DO SQL queries went over the SQL API and moving the payload to the body of the POST solved the issue

simon-contreras-deel commented 5 years ago

I have been checking the case and with the beta2 is not happening if you set the persist_as (in this case, the download will use the table name instead of the query).

In any case, I am going to add the POST support to the Dataset download method and DataObsClient augment

simon-contreras-deel commented 5 years ago

The solution will be using 'GET' by default and using POST if the query is longer than 1024 bytes. It will be done in the copyto method of carto-python

simon-contreras-deel commented 5 years ago

As I said before, using persist_as should work fine in beta2. Without the table name, it should be resolved using the dev branch