ownaginatious / fbchat-archive-parser

An application for parsing chat history from a Facebook data archive.
MIT License

Every conversation in separate file #38

Closed: Planbos closed this issue 7 years ago

Planbos commented 7 years ago

I'd like to suggest that the script write each conversation to a separate file. Could this feature be added?

ownaginatious commented 7 years ago

I've implemented this feature and released it in version 0.9.post9.

You can try it by specifying the -d or --directory option, which outputs to a directory:

fbcap messages.htm --directory /some/random/directory

That will create a subdirectory containing one file per conversation, named in the format thread_#.ext, plus a file called manifest.txt that lists the conversation participants in each file.

Let me know if that addresses all your needs.
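For readers who want to post-process the directory output, here is a minimal sketch of locating the generated files. It assumes the subdirectory is named like fbchat_dump_201705051419 (the name that appears in a traceback later in this thread) and that conversation files start with thread_. The helper name latest_dump_threads is hypothetical and not part of fbcap.

```python
import glob
import os


def latest_dump_threads(output_dir):
    """Return (newest dump subdirectory, sorted thread file paths).

    Assumes subdirectories are named fbchat_dump_<timestamp> with a
    fixed-width timestamp, so lexicographic order matches time order.
    """
    dumps = sorted(glob.glob(os.path.join(output_dir, "fbchat_dump_*")))
    if not dumps:
        return None, []
    newest = dumps[-1]
    # manifest.txt does not match this pattern, so only threads are listed.
    threads = sorted(glob.glob(os.path.join(newest, "thread_*")))
    return newest, threads
```

Called with the path passed to --directory, this returns the most recent dump and its thread files, or (None, []) if nothing has been written yet.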

Planbos commented 7 years ago

Dear ownaginatious, I was out of the office for a couple of days, so... I tried the -d option, but I get the same error with different messages.htm files.

The error looks like this:

Traceback (most recent call last):
  File "/usr/bin/fbcap", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.9.post12', 'console_scripts', 'fbcap')()
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 188, in main
    app.run()
  File "/usr/lib/python2.7/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/usr/lib/python2.7/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/usr/lib/python2.7/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 131, in fbcap
    write(format, fbch, directory or sys.stdout)
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/writers/__init__.py", line 35, in write
    write_to_dir(selected_writer(), stream_or_dir, data)
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/writers/__init__.py", line 46, in write_to_dir
    except FileNotFoundError:
NameError: global name 'FileNotFoundError' is not defined

Greetings and sorry for delay, PlanB
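For anyone hitting the same NameError: FileNotFoundError exists only in Python 3, while the traceback above comes from Python 2.7. The sketch below shows one version-agnostic way to treat a missing path as a normal case by checking errno. It is an illustration only, not necessarily how the project fixed the bug, and list_dir_or_empty is a hypothetical helper name.

```python
import errno
import os


def list_dir_or_empty(path):
    """Return the sorted entries of `path`, or [] if the path does not exist."""
    try:
        return sorted(os.listdir(path))
    except (IOError, OSError) as exc:
        # Python 3 raises FileNotFoundError (an OSError subclass); Python 2
        # raises a plain OSError. Both carry errno.ENOENT for a missing path.
        if exc.errno != errno.ENOENT:
            raise  # permission errors and the like should still surface
        return []
```

Checking exc.errno keeps the handler narrow, so unrelated I/O failures are not silently swallowed.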

ownaginatious commented 7 years ago

No problem. Could you try the latest version and see if it works for you now? (0.9.post15)

Planbos commented 7 years ago

Sorry, I've tried it, but I get errors like last time:

Continuing chat thread with [11111111111111@facebook.com, 11111111111111@facebook.com] <@3862 messages>
Continuing chat thread with [10Discovered
Continuing chat thread with [11111111111111@facebook.com, 1000114
Discovered chat thread with [11111111111111@facebook.com, 11111111111111@facebook.com]...
Continuing chat thread with [11111111111111@facebook.com, 11111111111111@facebook.com] <@1206 messages>
Discovered chat thread with [11111111111111@facebook.com]...
Skipping chat thread with unknown participants...
Continuing chat thread with [22222222222222222@facebook.com, 22222222222222222@facebook.com] <@16 messages>..
Discovered chat thread with [22222222222222222@facebook.com, 22222222222222222@facebook.com]...
Continuing chat thread with [22222222222222222@facebook.com, 22222222222222222@facebook.com] <@4808 messages>
Traceback (most recent call last):
  File "/usr/bin/fbcap", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.9.post15', 'console_scripts', 'fbcap')()
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 188, in main
    app.run()
  File "/usr/lib/python2.7/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/usr/lib/python2.7/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/usr/lib/python2.7/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 131, in fbcap
    write(format, fbch, directory or sys.stdout)
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/writers/__init__.py", line 35, in write
    write_to_dir(selected_writer(), stream_or_dir, data)
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/writers/__init__.py", line 51, in write_to_dir
    shutil.rmtree(directory)
  File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree
    onerror(os.listdir, path, sys.exc_info())
  File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree
    names = os.listdir(path)
OSError: [Errno 2] No such file or directory: '/home/bosti/Dokumenti/test//fbchat_dump_201705051419'

Greetings, PlanB
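The OSError above is raised by shutil.rmtree on a directory that does not exist. The traceback alone does not show why the cleanup ran, so the following is only a sketch of one defensive pattern, under the assumption that cleanup happens after a failed write: remove the output directory only if this call actually created it. write_dump and its writer argument are hypothetical names, not fbcap's API.

```python
import os
import shutil


def write_dump(parent_dir, name, writer):
    """Create parent_dir/name, run writer(path), and clean up on failure."""
    target = os.path.join(parent_dir, name)
    created = False
    try:
        os.makedirs(target)
        created = True
        writer(target)
        return target
    except Exception:
        # Only remove what this call created; if makedirs itself failed,
        # there is nothing to delete and rmtree would raise a second error.
        if created and os.path.isdir(target):
            shutil.rmtree(target)
        raise  # re-raise the original failure, not a cleanup error
```

With this shape, the user sees the error that actually caused the failure rather than a secondary "No such file or directory" from the cleanup step.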

Planbos commented 7 years ago

I've changed FB IDs for privacy reasons!

ownaginatious commented 7 years ago

Whoops, bug due to insufficient testing. Please try 0.9.post16 :)

Planbos commented 7 years ago

As you can see, it's almost the same error! I don't think the problem is general: I tried parsing different messages.htm files, and your script works perfectly with some of them but fails with others, with the following error:

Continuing chat thread with [100001111111111@facebook.com, 100022222222222@facebook.com] <@3862 messages>
Continuing chat thread with [100001111111111@facebook.com, 100022222222222@facebook.com] <@13862 messages
Discovered chat thread with [10000077777777@facebook.com, 100001111111111@facebook.com]...
Continuing chat thread with [100001111111111@facebook.com, 10000666666666@facebook.com] <@2185 messages>
Discovered chat thread with [100001111111111@facebook.com, 10005555555555555@facebook.com]...
Continuing chat thread with [100001111111111@facebook.com, 100022222222222@facebook.com] <@23862 messages
Discovered chat thread with [100001111111111@facebook.com, 1000888888888@facebook.com]...
Continuing chat thread with [10000077777777@facebook.com, 100001111111111@facebook.com] <@1206 messages>
Discovered chat thread with [100000000000000@facebook.com]...
Skipping chat thread with unknown participants...
Continuing chat thread with [1000888888888@facebook.com, 100001111111111@facebook.com] <@16 messages>..
Discovered chat thread with [100004444444444@facebook.com, 100001111111111@facebook.com]...
Continuing chat thread with [100001111111111@facebook.com, 1000888888888@facebook.com] <@4808 messages>
Traceback (most recent call last):
  File "/usr/bin/fbcap", line 9, in <module>
    load_entry_point('fbchat-archive-parser==0.9.post16', 'console_scripts', 'fbcap')()
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 188, in main
    app.run()
  File "/usr/lib/python2.7/site-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/usr/lib/python2.7/site-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/usr/lib/python2.7/site-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/main.py", line 131, in fbcap
    write(format, fbch, directory or sys.stdout)
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/writers/__init__.py", line 38, in write
    write_to_dir(selected_writer(), stream_or_dir, data)
  File "/usr/lib/python2.7/site-packages/fbchat_archive_parser/writers/__init__.py", line 58, in write_to_dir
    manifest.write("Chat history manifest for: %s\n\n" % data.user)
TypeError: write() argument 1 must be unicode, not str

So the script works, but not with every messages.htm file!

Thanks again, PlanB. If you find a solution I will be very thankful; if not, that's OK too.
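A likely reading of the TypeError above: on Python 2, a file opened in text mode through the io module accepts only unicode, and "..." % value yields a byte str whenever value is not itself unicode. That would explain why only some archives fail. The sketch below assumes the manifest is opened with io.open and shows one way to make the header unicode regardless of the value's type. It is not necessarily the fix shipped in 0.9.post17, and write_manifest_header is a hypothetical name.

```python
import io


def write_manifest_header(path, user):
    """Write the manifest header as text, whatever type `user` has."""
    # The u"" prefix makes the format string unicode on Python 2, so the
    # formatted result is unicode too. (A non-ASCII byte str would still
    # need an explicit .decode() on Python 2; that case is not handled.)
    header = u"Chat history manifest for: %s\n\n" % (user,)
    with io.open(path, "w", encoding="utf-8") as manifest:
        manifest.write(header)
```

The same code runs unchanged on Python 3, where every str is already unicode.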

ownaginatious commented 7 years ago

Ah, interesting problem. I think this should fix it: 0.9.post17

Planbos commented 7 years ago

Great! It's working now! Thanks! I can't thank you enough; this script will improve my ability to analyse FB messages. Greetings, PlanB

ownaginatious commented 7 years ago

Glad to see it works for you now :) Thanks for taking the time to report all the problems you found :+1:

Planbos commented 7 years ago

If you are interested, I am having issues when combining options, like this:

root@kali:~/Documents# fbcap messages.htm --resolve -d /root/Documents/test
Discovered chat thread with [Mag Pe]...
Traceback (most recent call last):
  File "/usr/local/bin/fbcap", line 11, in <module>
    load_entry_point('fbchat-archive-parser==0.9.post17', 'console_scripts', 'fbcap')()
  File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/main.py", line 188, in main
    app.run()
  File "/usr/local/lib/python2.7/dist-packages/clip.py", line 652, in run
    self.invoke(self.parse(tokens))
  File "/usr/local/lib/python2.7/dist-packages/clip.py", line 634, in invoke
    self._main.invoke(parsed)
  File "/usr/local/lib/python2.7/dist-packages/clip.py", line 519, in invoke
    self._callback(**{k: v for k, v in iteritems(parsed) if k not in self._subcommands})
  File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/main.py", line 125, in fbcap
    fbch = parser.parse()
  File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/parser.py", line 102, in parse
    self._parse_content()
  File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/parser.py", line 128, in _parse_content
    self._process_element(pos, element)
  File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/parser.py", line 203, in _process_element
    participants = self._parse_participants(e)
  File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/parser.py", line 179, in _parse_participants
    for p in participants_text.split(", ")]
  File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/name_resolver.py", line 155, in resolve
    return self._manual_lookup(facebook_id, facebook_id_string)
  File "/usr/local/lib/python2.7/dist-packages/fbchat_archive_parser/name_resolver.py", line 136, in _manual_lookup
    allow_redirects=True, timeout=10
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 501, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 630, in send
    history = [resp for resp in gen] if allow_redirects else []
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 190, in resolve_redirects
    **adapter_kwargs
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 641, in send
    r.content
  File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 797, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 726, in generate
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.facebook.com', port=443): Read timed out.

It is not crucial for me, but I see you are investigating a lot, and if this is something I can help with, I can copy and paste the errors :-)

ownaginatious commented 7 years ago

Does that happen every time? It looks like the tool is simply getting timeouts when connecting to Facebook. Are you able to access https://www.facebook.com on the computer you're running this on?
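Intermittent read timeouts like this one are often handled by retrying the call a few times. Below is a minimal, library-independent sketch of that idea. with_retries and fetch are hypothetical names; fetch stands in for whatever performs the HTTP lookup. The default retry_on=(IOError,) relies on the fact that the requests exceptions, including ConnectionError, derive from IOError.

```python
import time


def with_retries(fetch, attempts=3, delay=1.0, retry_on=(IOError,)):
    """Call fetch() up to `attempts` times, backing off between failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff: delay, 2*delay, 4*delay, ...
                time.sleep(delay * (2 ** attempt))
    raise last_error  # every attempt failed; surface the last error
```

Whether retrying is appropriate depends on the cause: it helps with transient network hiccups but not if the host is unreachable from the machine at all.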

Planbos commented 7 years ago

Sorry, I tried again and now it runs without errors. Thanks.