Closed oubiwann closed 10 months ago
I made a quick hack to get around this, and then got another error:
Traceback (most recent call last):
  File "/usr/local/bin/slack-to-discord", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/slack_to_discord/__main__.py", line 27, in main
    run_import(
  File "/usr/local/lib/python3.11/site-packages/slack_to_discord/importer.py", line 569, in run_import
    raise client._exception
  File "/usr/local/lib/python3.11/site-packages/slack_to_discord/importer.py", line 412, in on_ready
    await self._run_import(g)
  File "/usr/local/lib/python3.11/site-packages/slack_to_discord/importer.py", line 480, in _run_import
    for msg in slack_channel_messages(self._data_dir, chan_name, self._users, emoji_map, pins):
  File "/usr/local/lib/python3.11/site-packages/slack_to_discord/importer.py", line 176, in slack_channel_messages
    text = d["text"]
           ~^^^^^^^^
KeyError: 'text'
After a second work-around, the import is back up and running. Here's the diff for the hacks:
diff --git a/slack_to_discord/importer.py b/slack_to_discord/importer.py
index b83dc74..80acea1 100644
--- a/slack_to_discord/importer.py
+++ b/slack_to_discord/importer.py
@@ -142,6 +142,10 @@ def slack_filedata(f):
     }
+def ts_fun(x):
+    if "ts" in x:
+        return x["ts"]
+
 def slack_channel_messages(d, channel_name, users, emoji_map, pins):
     def mention_repl(m):
         type_ = m.group(1)
@@ -168,7 +172,9 @@ def slack_channel_messages(d, channel_name, users, emoji_map, pins):
     for file in sorted(glob.glob(os.path.join(channel_dir, "*.json"))):
         with open(file, "rb") as fp:
             data = json.load(fp)
-        for d in sorted(data, key=lambda x: x["ts"]):
+        for d in sorted(data, key=ts_fun):
+            if not "text" in d:
+                continue
             text = d["text"]
             text = MENTION_RE.sub(mention_repl, text)
             text = LINK_RE.sub(lambda x: x.group(1), text)
So far so good ... the importer is powering through YEARS of Slack messages (still). Haven't had to restart since the above hack was put in place.
Import completed successfully with ~30k messages.
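One caveat with the `ts_fun` hack, for anyone copying it: it returns `None` for a message without `ts`, and in Python 3 `sorted()` raises `TypeError` as soon as it has to compare a `None` key with anything else, so it can still fail when such a message shares a file with other messages. A minimal sketch of a more tolerant variant follows. The helper names `iter_sortable_messages` and `message_text` are hypothetical, not part of slack_to_discord, and this is not necessarily how the project itself handles the case.

```python
def iter_sortable_messages(data):
    """Yield messages in timestamp order, skipping entries with no "ts".

    Hypothetical helper for illustration; not part of slack_to_discord.
    """
    # Drop entries without a timestamp so the sort key is always a string.
    with_ts = [m for m in data if "ts" in m]
    # Same key as the original code: the raw "ts" value from the export.
    for msg in sorted(with_ts, key=lambda m: m["ts"]):
        yield msg


def message_text(msg):
    """Return the message text, treating a missing or null field as empty."""
    return msg.get("text") or ""
```

Skipping messages without `ts` is a design choice made here only to keep the sort well defined; whether such messages should be imported some other way depends on what they turn out to be.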
Glad everything worked for you!
Would it be possible for you to post or privately email me some (optionally redacted) JSON messages that don't have the `ts`/`text` fields in them? Since `ts` is a timestamp, I'm really curious about what sort of message wouldn't have it. For `text`, I've definitely seen events not have any, but usually it's just been an empty string/`null`, rather than the field being missing entirely.
Essentially I'd like to be able to understand why these cases exist in order to figure out the best way of dealing with them.
Yeah, that's the right way to do it.
I'll invert the logic, add a short-circuit, and then re-run: it should spit out the culprits pretty quickly.
I'll paste/attach sanitised data here ...
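For anyone else who wants to find the culprits without running the importer at all, here is a rough standalone sketch. It assumes the usual layout of an extracted Slack export (one directory per channel, each holding per-day `*.json` files containing a list of messages); the function name `find_incomplete_messages` and the `REQUIRED_KEYS` constant are made up for this example and are not part of slack_to_discord.

```python
import glob
import json
import os

# Fields the importer indexes directly (per the tracebacks in this thread).
REQUIRED_KEYS = ("ts", "text")


def find_incomplete_messages(export_dir):
    """Yield (path, message) for every message missing a required key.

    Assumes export_dir/<channel>/<date>.json files that each hold a JSON list.
    """
    pattern = os.path.join(export_dir, "*", "*.json")
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as fp:
            data = json.load(fp)
        if not isinstance(data, list):
            continue  # not a message file in the assumed layout
        for msg in data:
            if isinstance(msg, dict) and any(k not in msg for k in REQUIRED_KEYS):
                yield path, msg
```

Calling `for path, msg in find_incomplete_messages("/path/to/export"): print(path, msg)` would then list each offending message next to the file it came from, which should make it easier to redact and share.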
Nice job, btw -- this project's code is far cleaner (and thus not off-putting to tweak) than other projects for doing similar things.
Were you ever able to re-run with more logging enabled? In case it was a matter of not wanting to deal with duplicate messages being imported or figuring out how to stub the code out, here's a simple patch that will just log out the offending messages without importing anything (applies to v1.1.5 - the latest release).
It can be automatically applied by saving it to a file, then running `patch <path to slack_to_discord/importer.py> <path to patch>` in the terminal.
diff --git a/slack_to_discord/importer.py b/slack_to_discord/importer.py
index b83dc74..c6a73c9 100644
--- a/slack_to_discord/importer.py
+++ b/slack_to_discord/importer.py
@@ -168,6 +168,9 @@ def mention_repl(m):
     for file in sorted(glob.glob(os.path.join(channel_dir, "*.json"))):
         with open(file, "rb") as fp:
             data = json.load(fp)
+        for x in data:
+            if "ts" not in x or "text" not in x:
+                print(x) # print out problematic message
         for d in sorted(data, key=lambda x: x["ts"]):
             text = d["text"]
             text = MENTION_RE.sub(mention_repl, text)
@@ -260,6 +263,7 @@ def mention_repl(m):
else:
messages[ts] = msg
+ return # prevent actually importing anything
# Sort the dicts by timestamp and yield the messages
for msg in (messages[x] for x in sorted(messages.keys())):
msg["replies"] = [msg["replies"][x] for x in sorted(msg["replies"].keys())]
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/slack_to_discord/importer.py", line 408, in on_ready
    await self._run_import(g)
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/slack_to_discord/importer.py", line 476, in _run_import
    for msg in slack_channel_messages(self._data_dir, chan_name, self._users, emoji_map, pins):
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/slack_to_discord/importer.py", line 171, in slack_channel_messages
    for d in sorted(data, key=lambda x: x["ts"]):
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.12/lib/python3.12/site-packages/slack_to_discord/importer.py", line 171, in <lambda>
    for d in sorted(data, key=lambda x: x["ts"]):
                                        ~^^^^^^
KeyError: 'ts'
any solution to this @oubiwann @pR0Ps @maur1th
@syedzainqadri if you could apply the patch I posted above and report what it prints it would allow me to figure out why the issue is happening and how to fix it.
Alternatively, if you send me the export that's having issues (feel free to email it if it's sensitive, check my profile) I can figure it out from there.
Hi, I'm encountering the same error. Would be happy to email you the files in question.
here's the output
2024-01-17 23:05:26 INFO slack_to_discord.importer Processing channel '#techtalk'...
2024-01-17 23:05:26 CRITICAL slack_to_discord.importer Failed to finish import!
Traceback (most recent call last):
  File "/Users/damon/.pyenv/versions/3.9.1/lib/python3.9/site-packages/slack_to_discord/importer.py", line 408, in on_ready
    await self._run_import(g)
  File "/Users/damon/.pyenv/versions/3.9.1/lib/python3.9/site-packages/slack_to_discord/importer.py", line 476, in _run_import
    for msg in slack_channel_messages(self._data_dir, chan_name, self._users, emoji_map, pins):
  File "/Users/damon/.pyenv/versions/3.9.1/lib/python3.9/site-packages/slack_to_discord/importer.py", line 171, in slack_channel_messages
    for d in sorted(data, key=lambda x: x["ts"]):
  File "/Users/damon/.pyenv/versions/3.9.1/lib/python3.9/site-packages/slack_to_discord/importer.py", line 171, in <lambda>
    for d in sorted(data, key=lambda x: x["ts"]):
KeyError: 'ts'
2024-01-17 23:05:26 INFO slack_to_discord.importer Bot logging out
I'm using v1.1.5 via pip. I applied the patch from 8/4/23, but it still throws the error.
I ended up using the workaround posted earlier in this thread (the `ts_fun` diff) and it fixed the problem.
@damonseeley Thanks for sending me some sample data. The latest version (v1.1.6) has the fix in it.
Error message: