Closed shymu closed 6 years ago
The backup file should be okay. If the backup or attachment was corrupted, it would've been a more spectacular failure.
`extract` uses MIME types to determine output file extensions. I'm not sure why it doesn't have a recognised MIME type, but I've pushed a fix that will just give debug output instead. Are you able to build the program from source to have another try?
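As a rough illustration of choosing an extension from a MIME type, here is a minimal sketch. The `knownExtensions` table and the `outputName` helper are hypothetical names invented for this example and are not taken from signal-back's source. It only shows the general idea: look up the type, and fall back to a bare file name instead of aborting when the type is empty or unknown.

```go
package main

import "fmt"

// knownExtensions is a hypothetical lookup table for this sketch; the real
// tool may use a different list or a library.
var knownExtensions = map[string]string{
	"image/jpeg": "jpg",
	"image/png":  "png",
	"video/mp4":  "mp4",
}

// outputName builds a file name from an attachment ID and its MIME type.
// If the MIME type is empty or not in the table, it returns the bare ID and
// false, so the caller can log a warning and carry on.
func outputName(id uint64, mimeType string) (string, bool) {
	ext, ok := knownExtensions[mimeType]
	if !ok {
		return fmt.Sprintf("%d", id), false
	}
	return fmt.Sprintf("%d.%s", id, ext), true
}

func main() {
	for _, m := range []string{"image/jpeg", ""} {
		name, ok := outputName(1495165660156, m)
		fmt.Println(name, ok)
	}
}
```

Returning a boolean rather than exiting leaves the decision (warn, guess, or fail) to the caller.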
I was able to build from source, but it doesn't give me much more info, just a timestamp:
$ ./signal-back extract -p [pass] signal-[date].backup
2018/04/30 07:53:30 encoding `` not recognised. create a PR or issue if you think it should be
I made the wrong change. In `cmd/extract.go`, can you change the two instances of `log.Fatalf` at lines 224 and 225 to `log.Printf` instead, then try again? Optimally you should still get a file spat out, but it just won't have an extension. If you can check what that file's supposed to be, I can figure out what the issue is.
Ok, now I get 504 lines of this, before it finally ends (here's just a snippet from the end):
2018/04/30 08:06:09 encoding `` not recognised. create a PR or issue if you think it should be
2018/04/30 08:06:09 if you can provide details on the file `1495031440126` as well, it would be appreciated
2018/04/30 08:06:09 encoding `` not recognised. create a PR or issue if you think it should be
2018/04/30 08:06:09 if you can provide details on the file `1495036527311` as well, it would be appreciated
2018/04/30 08:06:09 encoding `` not recognised. create a PR or issue if you think it should be
2018/04/30 08:06:09 if you can provide details on the file `1495064583717` as well, it would be appreciated
2018/04/30 08:06:09 encoding `` not recognised. create a PR or issue if you think it should be
2018/04/30 08:06:09 if you can provide details on the file `1495165660156` as well, it would be appreciated
error: failed to extract attachment: failed to open output file: open 1495165660156.: too many open files
How do I figure out what the file is supposed to be? Without an extension I'm not sure where to start...
MacOS (iirc) determines file type by encoding rather than by extension (as Windows does), so you could just try opening them from Finder. If that doesn't work, try a text editor and see if they're at least plaintext. If that also doesn't work, I might need to check that I'm extracting from the right place in the backup.
Another alternative might be to change around line 72 to be the following:
if len(ps) == 25 {
	aEncs[*ps[19].IntegerParameter] = *ps[3].StringParamter
	if *ps[19].IntegerParameter == 1495165660156 {
		fmt.Printf("%v", ps)
	}
}
or change 1495165660156 to one of the other numbers it spat out.
Note that that might contain sensitive information, so censor at will if you need to.
Unless I’m missing something (I’m assuming the files are supposed to extract to the same folder as the backup file?), nothing actually extracts, so there is no file to check. All I seem to have to go on is this debug output.
Ah, so it hasn't. Missed that last line.
The fix is in master, or just add `file.Close()` above line 86.
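For anyone hitting the same "too many open files" error, the general pattern is sketched below. This is not the project's actual code: `writeAll` and `demo` are hypothetical names for this example. Each output file is closed as soon as it has been written, so only one descriptor is open at a time no matter how many attachments there are.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeAll writes each blob to its own file in dir. Every file is closed
// before the next one is opened. A defer inside the loop would not help,
// because deferred calls only run when writeAll itself returns.
func writeAll(dir string, blobs map[string][]byte) error {
	for name, data := range blobs {
		f, err := os.Create(filepath.Join(dir, name))
		if err != nil {
			return fmt.Errorf("failed to open output file: %w", err)
		}
		_, writeErr := f.Write(data)
		closeErr := f.Close()
		if writeErr != nil {
			return writeErr
		}
		if closeErr != nil {
			return closeErr
		}
	}
	return nil
}

// demo writes n small files into a temporary directory and reports how many
// files exist afterwards. The directory is removed before returning.
func demo(n int) (int, error) {
	dir, err := os.MkdirTemp("", "attachments")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	blobs := make(map[string][]byte, n)
	for i := 0; i < n; i++ {
		blobs[fmt.Sprintf("%d", i)] = []byte("data")
	}
	if err := writeAll(dir, blobs); err != nil {
		return 0, err
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	n, err := demo(2000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("files written:", n)
}
```

Raising the limit with `ulimit -n` works around the symptom, but closing each file promptly removes the dependency on that limit.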
Hi!
Same warnings for me on MacOS. I compiled your updated code and extracted every file of my backup. Attachment files are extracted without any extension. They are JPG, PNG and MPEG files.
If you can build and run the code in the `devel` branch, that might give me some insight. It should print out the whole entry if there's a missing encoding type.
Hey, sorry, there was some degree of user error on my part. These files actually were extracting to my $GOPATH/bin/ (since that's where I was running the command, apparently) and not where my backup was located.
Similar to @neurolit's findings, the files seem to be a mix of JPG, PNG and MPEG, all without extensions.
Anyway, I pulled the latest master, built, set `ulimit -n 1024`, and tried again.
This time it "completed" but every file still dumped something like this to the console:
2018/04/30 20:57:01 encoding `` not recognised. create a PR or issue if you think it should be
2018/04/30 20:57:01 if you can provide details on the file `1522622403410` as well, it would be appreciated
The referenced file, `1522622403410`, can be previewed in Finder (though it seems to lack any MIME data), and if I add .jpg to the filename I can open it just fine.
There are some files that Finder doesn't seem to know what to do with, and I can't figure out what they are either. For example:
$ file -I 1522811988909
1522811988909: application/octet-stream; charset=binary
$ hexdump -C 1522811988909
00000000 a4 12 b0 10 24 30 53 d6 0e a4 a9 d4 30 9f 23 99 |....$0S.....0.#.|
00000010 1a 20 a4 bd eb d2 71 73 cd 2b b4 3c f8 cd 6a 40 |. ....qs.+.<..j@|
00000020 f7 8d 62 c2 d7 05 0e 38 22 3a b3 b8 f3 91 b8 4f |..b....8":.....O|
00000030 5c 10 5b 5a d4 ea a0 8c c3 c6 cd 7b 4b c8 33 87 |\.[Z.......{K.3.|
00000040 0e 46 e2 e4 a9 02 0a 63 f9 b6 bd c6 72 52 04 9c |.F.....c....rR..|
00000050 e5 08 cb 7e 47 65 93 8c 36 10 a5 74 bd 5c c6 9d |...~Ge..6..t.\..|
00000060 81 58 e0 d0 1c 18 96 2e 68 2b 9c bb d3 d9 12 17 |.X......h+......|
00000070 8c 65 c8 9d 20 b7 ce 69 34 ef 33 42 bf b7 37 b7 |.e.. ..i4.3B..7.|
00000080 13 b4 36 1a 40 c3 32 55 f5 1f 7b 25 6a 8c 1f e1 |..6.@.2U..{%j...|
00000090 6d 14 39 a3 d3 ad 78 e6 73 9f 86 fb 61 40 c5 74 |m.9...x.s...a@.t|
000000a0 e1 31 92 14 c0 e5 44 63 40 d9 de a6 82 94 05 4a |.1....Dc@......J|
000000b0 e2 b0 76 42 65 47 cf b0 97 a0 5d a7 91 6a 41 21 |..vBeG....]..jA!|
000000c0 09 1c 15 5d 30 7c a5 41 41 97 14 91 c6 8e d9 d1 |...]0|.AA.......|
000000d0 35 df 03 45 23 20 8d ef 60 7d d8 17 ff 9e 5c 16 |5..E# ..`}....\.|
000000e0 63 23 e0 e0 be 99 bd 18 24 c1 87 72 a4 c2 76 df |c#......$..r..v.|
000000f0 18 1b d8 1d 0e 46 7d c9 2c b4 3f e4 46 1d 14 b9 |.....F}.,.?.F...|
00000100 08 da 24 54 d2 c1 d7 ae f7 46 30 2c b7 9a 14 e9 |..$T.....F0,....|
<snip>
Can you run the same on the `devel` branch? It should give a little more debugging output.
Unfortunately I've only been using git since I filed this bug, so forgive me for having such a basic issue, but I'm not sure how to pull the `devel` branch. So far I've tried:
$ git checkout devel
error: pathspec 'devel' did not match any file(s) known to git.
to no avail :(
All good.
$ git pull origin devel # or git clone https://github.com/xeals/signal-back --branch devel
$ git checkout devel
I don't seem to be getting any additional output w/ the `devel` branch:
2018/04/30 21:18:11 encoding `` not recognised. create a PR or issue if you think it should be
2018/04/30 21:18:11 if you can provide details on the file `1522811988909` as well, it would be appreciated
I did confirm those 3 line changes to `extract.go` were in the new file after pulling `devel` and building, so I'm fairly certain I'm using the right branch.
That's even more interesting. I've just hard-coded that file number, so pull and try again.
Nothing new
2018/04/30 21:24:34 encoding `` not recognised. create a PR or issue if you think it should be
2018/04/30 21:24:34 if you can provide details on the file `1522811988909` as well, it would be appreciated
Oh, that's more interesting.
I was under the impression that every attachment (stored as a binary blob in the backup) came with a matching `parts` SQL query containing metadata. If it's not running the part I've been changing, there's no metadata entry (or it's out of order). The next change will just spit out everything it finds. If you can put the entire thing into a gist/pastebin/etc. that'd be great. It shouldn't contain anything sensitive.
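To make the "attachment without a metadata entry" case concrete, here is a small sketch of the check. It assumes, as the snippet earlier in the thread suggests, that metadata is kept in a map from attachment ID to MIME type. The `missingMetadata` function and the sample values are hypothetical and only illustrate the lookup.

```go
package main

import (
	"fmt"
	"sort"
)

// missingMetadata returns, in ascending order, the attachment IDs that have
// no entry in encodings (a map from attachment ID to MIME type).
func missingMetadata(attachmentIDs []uint64, encodings map[uint64]string) []uint64 {
	var missing []uint64
	for _, id := range attachmentIDs {
		if _, ok := encodings[id]; !ok {
			missing = append(missing, id)
		}
	}
	sort.Slice(missing, func(i, j int) bool { return missing[i] < missing[j] })
	return missing
}

func main() {
	// Sample data for illustration only.
	encodings := map[uint64]string{1491660667638: "video/mp4"}
	ids := []uint64{1524953561841, 1491660667638}
	fmt.Println(missingMetadata(ids, encodings))
}
```

If every attachment ID ends up in the result, the metadata map is probably not being populated at all, rather than individual entries being absent.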
Also, could you provide the second line of the output of `signal-back analyze`? It should start with `map` and have a bunch of key/value pairs.
I tried with the `devel` branch:
With `extract`:
found attachment binary 1524953561841
2018/05/02 14:41:28 encoding `` not recognised. create a PR or issue if you think it should be
2018/05/02 14:41:28 if you can provide details on the file `1524953561841` as well, it would be appreciated
And with `analyze`:
map[insert_into_part:689 attachment:688 insert_into_identities:4 pref:2 create_index:19 drop_index:19 insert_into_thread:9 drop_table:13 create_table:13 insert_into_sms:32039 insert_into_mms:691 insert_into_recipient_preferences:538 avatar:1 version:1]
part: 27 statement:"INSERT INTO part VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)" parameters:<integerParameter:1 > parameters:<integerParameter:2 > parameters:<integerParameter:0 > parameters:<stringParamter:"video/mp4" > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<integerParameter:0 > parameters:<stringParamter:"/data/user/150/org.thoughtcrime.securesms/app_parts/part1361163667.mms" > parameters:<integerParameter:143930 > parameters:<stringParamter:"20170408_161034.mp4" > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<integerParameter:1491660667638 > parameters:<nullparameter:true > parameters:<nullparameter:true > parameters:<integerParameter:0 > parameters:<blobParameter:"-\351\276?\021\253\025\\\324\344E5q\300f\262\340\344v\022|\225:\244k\\\374\266\031\026G\211" > parameters:<nullparameter:true > parameters:<integerParameter:0 > parameters:<integerParameter:0 >
Is that the entire output of `extract`?
edit: I guess it's only the errored bit. For some reason one attachment doesn't have a matching metadata entry in yours. I might be able to rig up some sort of detection method if it's missing, but it might not work.
I have 688 attachments (approx. 2000 lines of logs). For each attachment, I've got these 4 lines, yes.
If it's not giving any lines starting with `found attachment metadata`, then I'm at a loss.
I've pushed a change that tries to guess encoding based on the file contents. You'll need to `dep ensure` before trying to run again.
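For readers curious what guessing from file contents can look like, below is a minimal sketch using Go's standard `http.DetectContentType`, which sniffs the leading bytes of a file. This is an assumption for illustration: the change described above may rely on a different library, and the `guessExtension` helper and its extension table are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// guessExtension sniffs the leading bytes of data and maps the detected
// MIME type to a file extension. Anything it cannot identify is reported
// as "unknown".
func guessExtension(data []byte) string {
	switch http.DetectContentType(data) {
	case "image/jpeg":
		return "jpg"
	case "image/png":
		return "png"
	case "video/mp4":
		return "mp4"
	default:
		return "unknown"
	}
}

func main() {
	pngHeader := []byte("\x89PNG\r\n\x1a\n")
	jpegHeader := []byte("\xff\xd8\xff\xe0")
	fmt.Println(guessExtension(pngHeader))
	fmt.Println(guessExtension(jpegHeader))
}
```

Content sniffing only recognises formats with a known signature, so some files can still end up without a useful extension.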
I confirm I have no `found attachment metadata` line. I'll try your new code.
It works!
Logs (for one attachment):
2018/05/02 15:24:28 found attachment binary 1524953561841
2018/05/02 15:24:28 file `1524953561841` has no associated SQL entry; going to have to guess at its encoding
Files now have the right extension, except for three of them (*.unknown files):
$ file *.unknown
1498568946944.unknown: MPEG ADTS, AAC, v4 LC, 44.1 kHz, monaural
1519327446092.unknown: MPEG ADTS, AAC, v4 LC, 44.1 kHz, monaural
1522171621751.unknown: ISO Media
Good. Thanks for your help with this. I don't think there's much I can do about those files myself, but it might be worth dropping an issue over at the upstream repo with the file types and initial X bytes if you're willing.
I'm trying to extract a backup made on Signal 4.18.3 on a Samsung Galaxy Alpha.
The backup completed (seemingly) successfully on the phone and the backup file was then transferred to a MacBook Pro using Google's "Android File Transfer" app.
Running signal-back with the "analyse" option on this file seems to work just fine, which leads me to believe the backup file is good. When I try to run an extract, I get this error:
I can't figure out if this is an error related to the backup file or signal-back, or how to further diagnose this.