Closed stygmate closed 1 year ago
@vladkens I don't fully understand the comment "# if login changed, old login can be cached in rawContent, so use less strict check",
but the current code always produces RTs with incorrect syntax (when truncated), because rt_msg = f"{prefix}{rt.rawContent}"
with prefix = "RT @"
always yields things like: RT @the original message.
To me, the correct code seems to be:
--- a/twscrape/models.py (revision 745bc59b662b72e46b0f5277e369767f2b318c06)
+++ b/twscrape/models.py (date 1694682596100)
@@ -230,12 +230,8 @@
     # issue #42 – restore full rt text
     rt = doc.retweetedTweet
     if rt is not None and rt.user is not None and doc.rawContent.endswith("…"):
-        # prefix = f"RT @{rt.user.username}: "
-        # if login changed, old login can be cached in rawContent, so use less strict check
-        prefix = "RT @"
-
-        rt_msg = f"{prefix}{rt.rawContent}"
-        if doc.rawContent != rt_msg and doc.rawContent.startswith(prefix):
+        rt_msg = f"RT @{rt.user.username}: {rt.rawContent}"
+        if doc.rawContent != rt_msg:
             doc.rawContent = rt_msg
     return doc
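
To make the difference concrete, here is a minimal, self-contained sketch that runs the logic from both sides of the diff on the same truncated retweet. The `User` and `Tweet` dataclasses, the function names `restore_old` and `restore_new`, and the sample handles and texts are stand-ins invented for illustration; they are not the real twscrape models, only the three fields the diff touches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    # Hypothetical stand-in for the real user model.
    username: str


@dataclass
class Tweet:
    # Hypothetical stand-in: only the fields used by the diff above.
    rawContent: str
    user: Optional[User] = None
    retweetedTweet: Optional["Tweet"] = None


def restore_old(doc: Tweet) -> Tweet:
    """Logic removed by the diff: loose "RT @" prefix, no username."""
    rt = doc.retweetedTweet
    if rt is not None and rt.user is not None and doc.rawContent.endswith("…"):
        prefix = "RT @"
        rt_msg = f"{prefix}{rt.rawContent}"
        if doc.rawContent != rt_msg and doc.rawContent.startswith(prefix):
            doc.rawContent = rt_msg
    return doc


def restore_new(doc: Tweet) -> Tweet:
    """Logic added by the diff: rebuild the full "RT @user: text" form."""
    rt = doc.retweetedTweet
    if rt is not None and rt.user is not None and doc.rawContent.endswith("…"):
        rt_msg = f"RT @{rt.user.username}: {rt.rawContent}"
        if doc.rawContent != rt_msg:
            doc.rawContent = rt_msg
    return doc


def make_retweet() -> Tweet:
    # A retweet whose own text was truncated, plus the full original tweet.
    original = Tweet(rawContent="the original message, in full", user=User("alice"))
    return Tweet(
        rawContent="RT @alice: the original mess…",
        user=User("bob"),
        retweetedTweet=original,
    )


old_result = restore_old(make_retweet()).rawContent
new_result = restore_new(make_retweet()).rawContent
print(old_result)  # RT @the original message, in full
print(new_result)  # RT @alice: the original message, in full
```

Note that the new version no longer checks that rawContent starts with "RT @", so it relies on the current username being the right one to display, which is the trade-off the original code comment was worried about.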
Patch https://github.com/vladkens/twscrape/pull/76 merged in v0.9. Thanks @stygmate
Sometimes rawContent contains things like:
RT @@handle @otherhandle [...] some text.
Maybe related to this code: https://github.com/vladkens/twscrape/blob/745bc59b662b72e46b0f5277e369767f2b318c06/twscrape/models.py#L237C48-L237C48
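
One possible way this could happen, sketched under assumptions: if the retweeted tweet is itself a reply whose text begins with a mention, then prepending the bare "RT @" prefix from the linked revision would put two "@" characters next to each other. The helper name `old_rt_text` and the sample strings below are hypothetical, and this has not been checked against real API responses.

```python
def old_rt_text(truncated: str, full_text: str) -> str:
    """Hypothetical helper mirroring the logic at the linked revision."""
    prefix = "RT @"
    rt_msg = f"{prefix}{full_text}"
    # Same conditions as the linked code: truncated, different, starts with "RT @".
    if truncated.endswith("…") and truncated != rt_msg and truncated.startswith(prefix):
        return rt_msg
    return truncated


# Assumed input: the retweeted tweet starts with a mention (placeholder handles).
truncated = "RT @alice: @handle @otherhandle some te…"
full_text = "@handle @otherhandle some text."

result = old_rt_text(truncated, full_text)
print(result)  # RT @@handle @otherhandle some text.
```

If that is the cause, versions that include the patch above should produce "RT @alice: @handle @otherhandle some text." for the same input instead.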