expectocode / telegram-export

Export Telegram chat data and history
Mozilla Public License 2.0

Add support for NULL values as first name/last name #71

Open Aleyasen opened 6 years ago

Aleyasen commented 6 years ago

It seems in Telegram it's possible to have NULL values for first name/last name, but it causes telegram-export to error out.

2018-07-24 17:59:11,356 - telegram_export.downloader - INFO - Getting participants...
2018-07-24 17:59:12,528 - telegram_export.downloader - INFO - Saved 436 new members, 0 left the chat.
2018-07-24 17:59:42,773 - telegram_export.downloader - INFO - Done. Retrieving full information about 515 missing entities.
2018-07-24 18:03:42,548 - telegram_export.dumper - ERROR - Integrity error: NOT NULL constraint failed: User.FirstName
2018-07-24 18:03:42,549 - asyncio - ERROR - Task exception was never retrieved
future: <Task finished coro=<Downloader._user_consumer() done, defined at /usr/local/lib/python3.7/site-packages/telegram_export/downloader.py:316> exception=IntegrityError('NOT NULL constraint failed: User.FirstName')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/telegram_export/downloader.py", line 320, in _user_consumer
    functions.users.GetFullUserRequest(await queue.get())
  File "/usr/local/lib/python3.7/site-packages/telegram_export/downloader.py", line 102, in _dump_full_entity
    self.dumper.dump_user(entity, photo_id=photo_id)
  File "/usr/local/lib/python3.7/site-packages/telegram_export/dumper.py", line 404, in dump_user
    where=('ID', user_full.user.id))
  File "/usr/local/lib/python3.7/site-packages/telegram_export/dumper.py", line 824, in _insert_if_valid_date
    return self._insert(into, values)
  File "/usr/local/lib/python3.7/site-packages/telegram_export/dumper.py", line 834, in _insert
    .format(into, fmt), values)
sqlite3.IntegrityError: NOT NULL constraint failed: User.FirstName
2018-07-24 18:18:46,284 - asyncio - ERROR - Task exception was never retrieved
future: <Task finished coro=<Downloader._media_consumer() done, defined at /usr/local/lib/python3.7/site-packages/telegram_export/downloader.py:305> exception=TypeError("'NoneType' object is not subscriptable")>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/telegram_export/downloader.py", line 311, in _media_consumer
    bar)
  File "/usr/local/lib/python3.7/site-packages/telegram_export/downloader.py", line 226, in _download_media
    media_type = media_row[3].split('.')
TypeError: 'NoneType' object is not subscriptable
Lonami commented 6 years ago

> It seems in Telegram it's possible to have NULL values for first name

Only for deleted accounts.

Aleyasen commented 6 years ago

@Lonami Thanks for the investigation. So is the workaround to ignore deleted accounts completely, or to relax the NOT NULL constraint on the database?

Lonami commented 6 years ago

@Aleyasen either option is valid.
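A minimal sketch of the two options, for illustration only. It assumes a simplified, hypothetical User table (the real telegram-export schema has more columns), and `make_db` and `dump_user` are illustrative helpers, not the project's API. A missing first name is used here as a stand-in for "deleted account"; Telethon's User object also appears to carry a `deleted` flag, which may be a more direct check.

```python
import sqlite3


def make_db(first_name_nullable):
    """Create an in-memory DB with a simplified, hypothetical User table."""
    db = sqlite3.connect(':memory:')
    # Option 2: relax the constraint by dropping NOT NULL from FirstName.
    constraint = '' if first_name_nullable else ' NOT NULL'
    db.execute(
        'CREATE TABLE User (ID INTEGER PRIMARY KEY, '
        'FirstName TEXT{}, LastName TEXT)'.format(constraint)
    )
    return db


def dump_user(db, user_id, first_name, last_name, skip_deleted=False):
    """Insert one user; return True if a row was written."""
    # Option 1: ignore deleted accounts, which arrive without a first name.
    if skip_deleted and first_name is None:
        return False
    db.execute('INSERT INTO User VALUES (?, ?, ?)',
               (user_id, first_name, last_name))
    return True
```

With the strict schema and `skip_deleted=False`, inserting a user whose first name is None raises the same `sqlite3.IntegrityError` shown in the log above. Skipping keeps the schema unchanged but loses the row; relaxing the constraint keeps the row, but any code that reads FirstName must then handle None.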

expectocode commented 6 years ago

@Aleyasen Can you see if the fix-71 branch fixes this?

Aleyasen commented 6 years ago

Sorry for my late reply, @expectocode. I tested it and, unfortunately, the issue is still there.

python3 telegram_export  --config-file config.ini
2018-08-18 14:48:41,906 - exporter - INFO - Saving to /opt/ws/telegram-export
2018-08-18 14:48:42,125 - telegram_export.downloader - INFO - Getting participants...
2018-08-18 14:48:42,841 - telegram_export.downloader - INFO - Saved 447 new members, 0 left the chat.
2018-08-18 14:49:14,305 - telegram_export.downloader - INFO - Done. Retrieving full information about 576 missing entities.
2018-08-18 14:54:55,124 - telegram_export.dumper - ERROR - Integrity error: NOT NULL constraint failed: User.FirstName
2018-08-18 14:54:55,125 - asyncio - ERROR - Task exception was never retrieved
future: <Task finished coro=<Downloader._user_consumer() done, defined at /usr/local/lib/python3.7/site-packages/telegram_export/downloader.py:316> exception=IntegrityError('NOT NULL constraint failed: User.FirstName')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/telegram_export/downloader.py", line 320, in _user_consumer
    functions.users.GetFullUserRequest(await queue.get())
  File "/usr/local/lib/python3.7/site-packages/telegram_export/downloader.py", line 102, in _dump_full_entity
    self.dumper.dump_user(entity, photo_id=photo_id)
  File "/usr/local/lib/python3.7/site-packages/telegram_export/dumper.py", line 404, in dump_user
    where=('ID', user_full.user.id))
  File "/usr/local/lib/python3.7/site-packages/telegram_export/dumper.py", line 824, in _insert_if_valid_date
    return self._insert(into, values)
  File "/usr/local/lib/python3.7/site-packages/telegram_export/dumper.py", line 834, in _insert
    .format(into, fmt), values)
sqlite3.IntegrityError: NOT NULL constraint failed: User.FirstName
entities:  42%|████████████████████████████████████                                                  | 253/603 [06:13<08:47,  0.66 entities/s, chat=XXX
media: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 45.2k/45.2k [06:13<00:00, 1.32kB/s, chat=XXX
mariodigital commented 5 years ago

> @Aleyasen either option is valid.

Can you blacklist deleted accounts? Purging all deleted accounts from a larger group is challenging, so if they could just be blacklisted somehow, that would make life a lot easier.

noameppel commented 5 years ago

I'm also having an issue with this. Is it possible to add an option such as --ignore-missing-entities?
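A rough sketch of how such an opt-in flag could behave. `--ignore-missing-entities` is only the name proposed in this thread, not an existing telegram-export option, and `parse_args`, `dump_all`, and the two-column User table are hypothetical. The idea is to catch the integrity error per user so that one bad row does not stop the whole dump.

```python
import argparse
import sqlite3


def parse_args(argv):
    parser = argparse.ArgumentParser(prog='telegram_export')
    parser.add_argument('--config-file', default='config.ini')
    # Hypothetical flag, as proposed in this thread; not an existing option.
    parser.add_argument('--ignore-missing-entities', action='store_true')
    return parser.parse_args(argv)


def dump_all(db, users, ignore_errors):
    """Insert (id, first_name) pairs; return the ids that were skipped."""
    skipped = []
    for user_id, first_name in users:
        try:
            db.execute('INSERT INTO User VALUES (?, ?)',
                       (user_id, first_name))
        except sqlite3.IntegrityError:
            if not ignore_errors:
                raise  # default behaviour: fail as before
            skipped.append(user_id)
    return skipped
```

Returning the skipped IDs lets the caller log which accounts were left out, so nothing is dropped silently.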