Closed jleroy closed 1 year ago
The issue seems to come from: https://github.com/micahflee/semiphemeral/blob/be913fda43388de9c6861fc93427c50557dfbcd1/semiphemeral/twitter.py#L385
...which has been introduced by commit https://github.com/micahflee/semiphemeral/commit/d1a1f6b0722eaefabbc0b2bc03c3de6c95bd1c8e.
According to SQLite documentation, this is because you can't have more than 999 SQL variables per query. This limit is hardcoded in the SQLITE_LIMIT_VARIABLE_NUMBER
constant, which can only be lowered at connection time, not increased.
The only solution here is to divide this query into smaller queries by splitting like_status_ids
into smaller lists:
like_status_ids = zip(*(iter(like_status_ids),) * 999)
Having a similar problem here.
$ semiphemeral unlike --filename like.js
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
cursor, statement, parameters, context
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
cursor.execute(statement, parameters)
sqlite3.OperationalError: too many SQL variables
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/semiphemeral", line 10, in <module>
sys.exit(main())
File "/usr/local/lib64/python3.7/site-packages/click/core.py", line 764, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib64/python3.7/site-packages/click/core.py", line 717, in main
rv = self.invoke(ctx)
File "/usr/local/lib64/python3.7/site-packages/click/core.py", line 1137, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib64/python3.7/site-packages/click/core.py", line 956, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib64/python3.7/site-packages/click/core.py", line 555, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/semiphemeral/__init__.py", line 75, in unlike
t.unlike(filename)
File "/usr/local/lib/python3.7/site-packages/semiphemeral/twitter.py", line 385, in unlike
for tweet in self.common.session.query(Tweet).filter(Tweet.status_id.in_(like_status_ids)).order_by(Tweet.created_at.desc()).all():
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3211, in all
return list(self)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3367, in __iter__
return self._execute_and_instances(context)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3392, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 982, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1101, in _execute_clauseelement
distilled_params,
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
e, statement, parameters, cursor, context
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
cursor, statement, parameters, context
File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) too many SQL variables
[SQL: SELECT tweets.id AS tweets_id, tweets.created_at AS tweets_created_at, tweets.user_id AS tweets_user_id, tweets.user_screen_name AS tweets_user_screen_name, tweets.status_id AS tweets_status_id, tweets.lang AS tweets_lang, tweets.source AS tweets_source, tweets.source_url AS tweets_source_url, tweets.text AS tweets_text, tweets.in_reply_to_screen_name AS tweets_in_reply_to_screen_name, tweets.in_reply_to_status_id AS tweets_in_reply_to_status_id, tweets.in_reply_to_user_id AS tweets_in_reply_to_user_id, tweets.retweet_count AS tweets_retweet_count, tweets.favorite_count AS tweets_favorite_count, tweets.retweeted AS tweets_retweeted, tweets.favorited AS tweets_favorited, tweets.is_retweet AS tweets_is_retweet, tweets.is_deleted AS tweets_is_deleted, tweets.is_unliked AS tweets_is_unliked, tweets.exclude_from_delete AS tweets_exclude_from_delete, tweets.thread_id AS tweets_thread_id
FROM tweets
[...]
I used this to get it to work.
def chunks(lst, n):
"""Yield successive n-sized chunks from lst.; helper for 'unlike'"""
for i in range(0, len(lst), n):
yield lst[i:i + n]
(and later in the unlike
function:)
loaded_status_ids = []
for i_like_ids in chunks(like_status_ids, 600):
for tweet in self.common.session.query(Tweet).filter(Tweet.status_id.in_(i_like_ids)).order_by(Tweet.created_at.desc()).all():
if tweet.created_at < datetime_threshold:
all_tweets.append(tweet)
loaded_status_ids.append(tweet.status_id)
Are you able to create a patch implementing this? I'd be happy to review and merge it. Thanks!