Closed russss closed 3 years ago
This works beautifully, thanks @russss .
To get your twitter archive zip go here: https://help.twitter.com/en/managing-your-account/how-to-download-your-twitter-archive
Thanks for this PR. With this branch I was able to successfully import all of my tweets, even those that radiergummi wasn't able to delete. semiphemeral delete (master branch) is now working its way through removing all those older tweets.
This adds more evidence for the theory that if you have a large 'hole' in your tweets, Twitter's API is not able to see past that hole to older tweets that may still be undeleted, meaning semiphemeral can't access them without this kind of import.
Tried this to import from archive, but got an error:
  File "/home/debian/.local/lib/python3.7/site-packages/semiphemeral/db.py", line 52, in __init__
    self.created_at = status.created_at
AttributeError: 'Status' object has no attribute 'created_at'
I just tried importing from archive and got the same error. It looks like an issue with parsing the JSON object for each tweet. In my tweet.js file, each object in the array has a 'tweet' attribute and all of the metadata is nested inside it.
I've modified line 502 in twitter.py to:
t = Status.parse(self.api, tweet['tweet'])
That has got it working. I'm not sure whether there has been a recent change in the JSON structure for the tweet archives or if something else is going on.
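For anyone unsure which layout their archive uses, here is a minimal sketch of a helper that accepts both: entries whose metadata is nested under a 'tweet' key, and entries that carry the metadata directly. The function name unwrap_tweet and the sample dictionaries are hypothetical and not part of this PR; the result would then be passed to Status.parse as in the line above.

```python
def unwrap_tweet(entry):
    """Return the tweet metadata dict from one archive entry.

    Handles both layouts discussed in this thread: metadata nested under
    a 'tweet' key, or metadata stored directly on the entry.
    (Hypothetical helper, not part of semiphemeral.)
    """
    if isinstance(entry.get("tweet"), dict):
        return entry["tweet"]
    return entry


# Made-up sample entries, one per layout.
nested = {"tweet": {"id_str": "1", "created_at": "Mon Jan 01 00:00:00 +0000 2018"}}
flat = {"id_str": "2", "created_at": "Tue Jan 02 00:00:00 +0000 2018"}

print(unwrap_tweet(nested)["id_str"])
print(unwrap_tweet(flat)["id_str"])
```

Checking for the key rather than assuming it keeps the import working whichever structure a given export happens to use.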
Thanks for the work on this PR and for the tool itself.
That worked for me. Thank you!
This is really great, thank you for the branch (and thanks for the semiphemeral project as well!).
For anyone trying to run this with recent commits, I resolved a few merge conflicts in this branch: https://github.com/mattnworb/semiphemeral/tree/archive-import-merged
Can this PR get rebased please? It'd be nice to get this landed. :)
Due to the nature of this feature, I don't really need it now, but I'm happy to rebase it and apply the above changes if I get some indication from @micahflee that it'll be merged.
I can do the merge, but there's been a lot of code changes in the last few years ;)
Oops, I missed that you're a committer! I've just pushed a rebase onto the latest HEAD (also including encoding="UTF-8") but I'm waiting on Twitter to finish an export so I can actually test it.
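To illustrate why the explicit encoding matters: on platforms where Python's default text encoding is not UTF-8 (many Windows setups, for example), reading tweet.js without it can mis-decode or fail on non-ASCII tweet text. The sketch below is not the code in this PR. It assumes tweet.js is a JavaScript assignment whose right-hand side is a JSON array; the helper name load_tweet_js and the sample contents, including the variable name in the prefix, are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_tweet_js(path):
    """Read an archive tweet.js file and return its entries as a list.

    Assumes the file looks like `<some JS variable> = [ ...JSON... ]`.
    (Hypothetical helper, not part of semiphemeral.)
    """
    # Decode explicitly as UTF-8 instead of relying on the platform default.
    text = Path(path).read_text(encoding="UTF-8")
    # Keep only the JSON array, dropping the JavaScript assignment prefix.
    start = text.index("[")
    end = text.rindex("]") + 1
    return json.loads(text[start:end])


# Made-up sample file with non-ASCII text; the prefix is an assumption.
sample = 'window.YTD.tweet.part0 = [{"tweet": {"id_str": "1", "full_text": "caf\u00e9 \u2615"}}]'
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "tweet.js"
    sample_path.write_text(sample, encoding="UTF-8")
    tweets = load_tweet_js(sample_path)

print(len(tweets))
```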
I also note there's a related feature for DMs now. I guess it would be ideal to combine these features but I'm not sure I have the time/inclination to do so.
PS C:\Users\Alex\semiphemeral> python app.py import C:\Users\Alex\Downloads\archive\data
semiphemeral 0.7
Importing 35600 tweets from C:\Users\Alex\Downloads\archive\data
Traceback (most recent call last):
  File "C:\Users\Alex\semiphemeral\app.py", line 4, in <module>
    semiphemeral.main()
  File "C:\Users\Alex\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\click\core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Alex\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\click\core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "C:\Users\Alex\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\click\core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\Alex\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\click\core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Alex\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\click\core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Alex\semiphemeral\semiphemeral\__init__.py", line 81, in archive_import
    t.import_dump(path)
  File "C:\Users\Alex\semiphemeral\semiphemeral\twitter.py", line 800, in import_dump
    self.import_tweets(
AttributeError: 'Twitter' object has no attribute 'import_tweets'
Any idea why this happens whenever I'm trying to import the archive?
I see the same output as @ctrlBIRDdelete (although in macOS). Any ideas on how to fix?
I imported my archive and ran delete. Then I downloaded a new version of my archive, tried to import it again, and got exactly your error. I simply restarted my MacBook and now it's running fine. Give it a try maybe :)
This PR is a bit messy for reasons mentioned below. It's working fine for me so I'm going to raise this PR mostly so I don't forget about it.
This PR adds a feature to use a Twitter archive export to initially populate the semiphemeral DB. This has the following advantages:
You run it by requesting your archive, unzipping it, and then running semiphemeral import path/to/archive_dir.
Issues
This depends on #22.