datajoint / datajoint-python

Relational data pipelines for the science lab
https://datajoint.com/docs
GNU Lesser General Public License v2.1
169 stars 84 forks source link

Error with `.aggr()` in datajoint 0.13.X #944

Open kabilar opened 3 years ago

kabilar commented 3 years ago

Bug Report

Description

When using the ... argument in table1.aggr(table2, ...) to retain all attributes from table1 the following error occurs: OperationalError: (1055, "'neuro_scan.scan.scanner' isn't in GROUP BY")

Reproducibility

In [4]: session=dj.create_virtual_module('neuro_session','neuro_session')

In [5]: scan=dj.create_virtual_module('neuro_scan','neuro_scan')

In [6]: scan.Scan()
Out[6]: 
*subject       *session_datet *scan_id    scanner        acq_software   scan_notes    
+------------+ +------------+ +---------+ +------------+ +------------+ +------------+
test_scanimag 2020-06-09 17: 0           MINI2P_Imaging ScanImage                    
 (Total: 1)

In [7]: session.Session()
Out[7]: 
*subject       *session_datet
+------------+ +------------+
test_scanimag 2020-06-09 17:

In [8]: session.Session.aggr(scan.Scan)
Out[8]: 
*subject       *session_datet
+------------+ +------------+
test_scanimag 2020-06-09 17:
 (Total: 1)

In [9]: session.Session.aggr(scan.Scan,...)
Out[9]: 
*subject       *session_datet
+------------+ +------------+
test_scanimag 2020-06-09 17:
 (Total: 1)

In [10]: scan.Scan.aggr(session.Session)
Out[10]: 
*subject       *session_datet *scan_id   
+------------+ +------------+ +---------+
test_scanimag 2020-06-09 17: 0          
 (Total: 1)

In [11]: scan.Scan.aggr(session.Session,...)
Out[11]: ---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    392                         if cls is not object \
    393                                 and callable(cls.__dict__.get('__repr__')):
--> 394                             return _repr_pprint(obj, self, cycle)
    395 
    396             return _default_pprint(obj, self, cycle)

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    698     """A pprint that just redirects to the normal repr function."""
    699     # Find newlines and replace them with p.break_()
--> 700     output = repr(obj)
    701     lines = output.splitlines()
    702     with p.group():

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/datajoint/expression.py in __repr__(self)
    527         :rtype: str
    528         """
--> 529         return super().__repr__() if config['loglevel'].lower() == 'debug' else self.preview()
    530 
    531     def preview(self, limit=None, width=None):

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/datajoint/expression.py in preview(self, limit, width)
    531     def preview(self, limit=None, width=None):
    532         """ :return: a string of preview of the contents of the query. """
--> 533         return preview(self, limit, width)
    534 
    535     def _repr_html_(self):

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/datajoint/preview.py in preview(query_expression, limit, width)
     11     if width is None:
     12         width = config['display.width']
---> 13     tuples = rel.fetch(limit=limit + 1, format="array")
     14     has_more = len(tuples) > limit
     15     tuples = tuples[:limit]

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/datajoint/fetch.py in __call__(self, offset, limit, order_by, format, as_dict, squeeze, download_path, *attrs)
    191         else:  # fetch all attributes as a numpy.record_array or pandas.DataFrame
    192             cur = self._expression.cursor(
--> 193                 as_dict=as_dict, limit=limit, offset=offset, order_by=order_by)
    194             heading = self._expression.heading
    195             if as_dict:

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/datajoint/expression.py in cursor(self, offset, limit, order_by, as_dict)
    517             sql += ' LIMIT %d' % limit + (' OFFSET %d' % offset if offset else "")
    518         logger.debug(sql)
--> 519         return self.connection.query(sql, as_dict=as_dict)
    520 
    521     def __repr__(self):

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/datajoint/connection.py in query(self, query, args, as_dict, suppress_warnings, reconnect)
    300         cursor = self._conn.cursor(cursor=cursor_class)
    301         try:
--> 302             self._execute_query(cursor, query, args, suppress_warnings)
    303         except errors.LostConnectionError:
    304             if not reconnect:

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/datajoint/connection.py in _execute_query(cursor, query, args, suppress_warnings)
    266                 cursor.execute(query, args)
    267         except client.err.Error as err:
--> 268             raise translate_query_error(err, query)
    269 
    270     def query(self, query, args=(), *, as_dict=False, suppress_warnings=True, reconnect=None):

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/datajoint/connection.py in _execute_query(cursor, query, args, suppress_warnings)
    264                     # suppress all warnings arising from underlying SQL library
    265                     warnings.simplefilter("ignore")
--> 266                 cursor.execute(query, args)
    267         except client.err.Error as err:
    268             raise translate_query_error(err, query)

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/pymysql/cursors.py in execute(self, query, args)
    146         query = self.mogrify(query, args)
    147 
--> 148         result = self._query(query)
    149         self._executed = query
    150         return result

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/pymysql/cursors.py in _query(self, q)
    308         self._last_executed = q
    309         self._clear_result()
--> 310         conn.query(q)
    311         self._do_get_result()
    312         return self.rowcount

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/pymysql/connections.py in query(self, sql, unbuffered)
    546             sql = sql.encode(self.encoding, "surrogateescape")
    547         self._execute_command(COMMAND.COM_QUERY, sql)
--> 548         self._affected_rows = self._read_query_result(unbuffered=unbuffered)
    549         return self._affected_rows
    550 

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/pymysql/connections.py in _read_query_result(self, unbuffered)
    773         else:
    774             result = MySQLResult(self)
--> 775             result.read()
    776         self._result = result
    777         if result.server_status is not None:

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/pymysql/connections.py in read(self)
   1154     def read(self):
   1155         try:
-> 1156             first_packet = self.connection._read_packet()
   1157 
   1158             if first_packet.is_ok_packet():

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/pymysql/connections.py in _read_packet(self, packet_type)
    723             if self._result is not None and self._result.unbuffered_active is True:
    724                 self._result.unbuffered_active = False
--> 725             packet.raise_for_error()
    726         return packet
    727 

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/pymysql/protocol.py in raise_for_error(self)
    219         if DEBUG:
    220             print("errno =", errno)
--> 221         err.raise_mysql_exception(self._data)
    222 
    223     def dump(self):

/opt/miniconda3/envs/workflow-calcium-imaging/lib/python3.7/site-packages/pymysql/err.py in raise_mysql_exception(data)
    141     if errorclass is None:
    142         errorclass = InternalError if errno < 1000 else OperationalError
--> 143     raise errorclass(errno, errval)

OperationalError: (1055, "'neuro_scan.scan.scanner' isn't in GROUP BY")

Expected Behavior

From my understanding, the ... argument used in table1.aggr(table2, ...) should retain all attributes from table1.

Thank you!

kabilar commented 3 years ago

@ttngu207

jverswijver commented 2 years ago

The fix for this is likely the same as #916

jverswijver commented 2 years ago

@kabilar as it turns out this is a MariaDB specific error. MariaDB's implementation of the sql mode ONLY_FULL_GROUP_BY is incorrect when it comes to select on fields that are functionally dependent, it wants those functionally dependent fields to also be in the group by which is unnecessary from a database perspective. MySQL does not do that with the ONLY_FULL_GROUP_BY mode enabled which is correct. The next step would be to confirm if this error persists in the current version of MariaDB.

jverswijver commented 2 years ago

It seems the MariaDB devs have known about this for a while but has not fixed it yet, https://jira.mariadb.org/browse/MDEV-11588