TeamWin / Team-Win-Recovery-Project

Core recovery files for the Team Win Recovery Project (T.W.R.P) - this is not up to date, please see https://github.com/TeamWin/android_bootable_recovery/
http://twrp.me

extractTarFork() process ended with ERROR: 255 #964

Closed dexbyte closed 5 years ago

dexbyte commented 7 years ago

E: ADB Restore Failed. I had the same problem with TWRP 3.1.0 and TWRP 3.1.1 didn't fix this bug either. I am trying to restore the backup created via adb to pc. Restore always fails.

sav-valerio commented 5 years ago

Can confirm other reports, make sure that you have enough free space inside your Internal Storage. (even if the backup is of the data partition) Maybe it's used for some kind of temporary storage for extracting the backup. I just moved the backup onto an external HD (USB OTG) and the restore worked flawlessly.

HaleTom commented 5 years ago

This XDA thread proposes that the source of the issue is parallel apps.

I did at one stage create another user on my phone, and am now seeing this issue.

Happyfeet01 commented 5 years ago

The same here, but no multiple Users. https://github.com/TeamWin/android_device_google_sailfish/issues/3

devon-ge commented 5 years ago

Agree with @sav-valerio. My Huawei H60-L11 (Android 6) has only 3900M storage (data partition) left. When the restoring approaches 3900M, it stopped with extractTarFork() process ended with ERROR: 255. So make sure you have enough storage space.

Murrfk commented 5 years ago

I had this problem and it took me two days to resolve it. I had a good backup, and I know it was good because I had used it several times and it worked. But then, after doing some system installs, I got the error 255. I tried several things to fix it and nothing worked, and I really did not want to lose my backup. What did work for me was to use the TWRP wipe function to FORMAT the phone. Wiping or factory reset did not work. I lost the contents of my internal SD card, but I was able to transfer most of it to my external SD card. You could also connect the phone to a computer with a cable and use the pull command in ADB to back up the contents of the internal SD card. Given the choice between losing the SD card contents and the data files, I chose to lose some of the SD card. Anyway, a FORMAT worked when nothing else did. I read that this error is caused by a lack of space. When I checked the log, the restore seemed to be failing on a particular file, which misled me into thinking I had to remove a corrupt file. But since I was able to restore without any problem after the format, it would seem that there was no corrupt data. Hope this helps someone.

samarium commented 5 years ago

I had error 255 issue on restore recently from TWRP GUI. However if I manually extracted each of the win0?? backup files using tar to a freshly wiped data partition, then the recovery was OK.

I didn't see anything that would suggest corrupt tar archives as above, apart from multiple "tar: Malformed extended header: missing equal sign" errors when I tried to tar tvf on linux desktop. When I tried tar tvf from TWRP booted adb shell session then it worked fine.

I was making the backups with TWRP 3.1.1-0, and TWRP recovery failed for TWRP 3.1.1-0 and TWRP 3.2.3-0. Even using TWRP 3.2.3-0, the win000 file created causes the /system/bin/tar from LineageOS 14.1 to abort, so maybe TWRP 3.2.3-0 will abort too? Haven't tested it yet.

theorangepotato commented 5 years ago

I am also seeing the extractTarFork() process ended with ERROR: 255 error. I have cleared out 10G of space, and it still occurs, so it is not a space issue. It always gets to about 97% before failing. I both made the backup, and am trying to restore it, with the twrp-3.2.3-1-sailfish.img GUI, not adb.

The last few lines of /tmp/recovery.log show the following:

  ==> extracting: //data/misc_ce/0/wifi/WifiConfigStore.xml (file size 844 bytes)
  ==> extracting: //data/user/ (mode 40711, directory)
  ==> extracting: //data/user/0 (symlink to /data/data)
  ==> extracting: //data/user_de/ (mode 40711, directory)
  ==> extracting: //data/user_de/0/ (mode 40771, directory)
restoring policy 1DE0 > 'ac26c78c10ce0000' to '//data/user_de/0/'
  ==> extracting: //data/user_de/0/com.google.android.ext.services/ (mode 40700, directory)
tar_extract_file(): failed to extract //data/user_de/0/com.google.android.ext.services/ !!!
I:Unable to extract tar archive '/data/media/0/TWRP/BACKUPS/FA6920315332/2019-04-10--13-18-04_lineage_sailfish-userdebug_810_OPM718120500/data.ext4.win004'
Error during restore process.
I:Error extracting '/data/media/0/TWRP/BACKUPS/FA6920315332/2019-04-10--13-18-04_lineage_sailfish-userdebug_810_OPM718120500/data.ext4.win004' in thread ID 0
I:Error extracting split archive.
Error during restore process.
extractTarFork() process ended with ERROR: 255
I:Set page: 'action_complete'
I:operation_end - status=1
I:Set overlay: ''
I:TWFunc::Set_Brightness: Setting brightness control to 5
I:TWFunc::Set_Brightness: Setting brightness control to 0

So, the error is occurring in my data.ext4.win004. I have tried manually extracting that file with GNU tar on my linux machine, and besides a lot of these errors:

tar: Ignoring unknown extended header keyword 'TWRP.security.e4crypt'
tar: Malformed extended header: missing equal sign

it seems to successfully extract. However, when it is extracted, there in fact does not seem to be a //data/user_de/0/com.google.android.ext.services/! However, there is a //data/user_de/0/android.ext.services/, but I don't know if this is the source of the confusion. It's hard to tell if it's a failure to extract it on my machine, or if the folder just doesn't exist in the backup.

Any help would be greatly appreciated.

UPDATE: When I try to extract data.ext4.win004 on the device, it outputs the following:

tar: /data/user_de/0/com.google.android.ext.services: can't create: No such file or directory
tar: chown 10049:10049 '/data/user_de/0/com.google.android.ext.services': No such file or directory
tar: :/data/user_de/0/com.google.android.ext.services: not created
tar: :/data/user_de/0/com.google.android.ext.services: not created
tar: /data/user_de/0/com.android.providers.telephony: can't create: No such file or directory
tar: chown 1001:1001 '/data/user_de/0/com.android.providers.telephony': No such file or directory
tar: :/data/user_de/0/com.android.providers.telephony: not created
tar: :/data/user_de/0/com.android.providers.telephony: not created
tar: :/data/user_de/0/com.android.providers.telephony: not created
tar: :/data/user_de/0/com.android.providers.telephony: not created
tar: :/data/user_de/0/com.android.providers.telephony: not created
tar: invalid tar format

So it looks like it's an issue with the tar itself, maybe the headers?

Murrfk commented 5 years ago

@theorangepotato How big is your backup? 10 GB of free space might not be enough to hold the restored backup. Did you try what I suggested previously?

theorangepotato commented 5 years ago

@Murrfk Yes, I did a full wipe, placed only the backup on the device, and did the restore. Sadly, it stopped at the same place it always had. I also tried a factory reset, wiping, flashing, pretty much every combination I could think of.

samarium commented 5 years ago

IIRC I was able to extract when I connected to recovery mode via adb and manually extracted the backups using tar on the phone.

Each .win00* file is just a tar file, optionally compressed, so you just have to extract it using tar, and tar handles the compression if needed.
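
A rough sketch of that idea in Python instead of the tar binary, in case it helps someone inspect a chunk on a desktop. The function name `list_backup_members` is made up for illustration. Mode "r:*" lets Python's tarfile module detect compression on its own. Nothing in this thread establishes how tarfile copes with TWRP's custom extended headers, and tarfile can stop quietly at a damaged header instead of raising, so a clean listing is not proof of a healthy archive.

```python
import tarfile


def list_backup_members(path):
    """Return (name, size) pairs for every entry tarfile can read.

    Mode "r:*" auto-detects gzip/bzip2/xz compression, which mirrors
    how tar handles an optionally compressed .win file.
    """
    entries = []
    with tarfile.open(path, mode="r:*") as archive:
        for member in archive:
            entries.append((member.name, member.size))
    return entries
```

Calling `list_backup_members("data.ext4.win000")` on a copy of a chunk shows which paths it holds, which is the same information `tar -t` gives you before deciding where to extract.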

I might have had to switch to an older or newer version of TWRP, don't remember.

It seems the problem is in the TWRP recovery program which uses internal tar code, and doesn't use the normal tar binary.

TjrGithub commented 5 years ago

This bug bit me with official twrp-3.2.3-2-mido. Now I have a phone without a modem.

EDIT: At least two other people have had the same problem. There might be something about the /dev/block/bootdevice/by-name/modem partition that triggers this bug.

https://forum.xda-developers.com/showpost.php?p=72742481&postcount=9

crok commented 5 years ago

This bug bit me with official twrp-3.2.3-2-mido. Now I have a phone without a modem.

Easy to fix: go here, grab a firmware zip, and flash it: https://xiaomifirmwareupdater.com/#stable#mido I personally recommend using the last Chinese stable one: V10.2.2.0.NCFCNXM

null-von-sushi commented 5 years ago

So I too cannot restore my data backup from LOS 16 on Sony XA2 (pioneer). I looked at the two programs mentioned here, but one is for windows only, and the other just produces a copy of the file with no changes.

Is there any way I can actually get TWRP to restore my backup, or is my data lost?

samarium commented 5 years ago

see my comments above about booting phone to recovery, connecting using adb shell, and then using the shell tar command to do the extract.

null-von-sushi commented 5 years ago

okay, I could do that (or just use tar on my computer I guess?) but where do I extract them to? does TWRP just magically pick up the extracted files?

samarium commented 5 years ago

You could try that, but if you read my comments above, it didn't work for me.

use tar -t to check the path of the saved files and then arrange for them to be extracted into the right place with tar -x

you only need to manual tar for the parts of the backup that twrp fails to restore. For me that was just data. If you are doing a manual restore of data then you don't need to do anything after the restore with twrp.

null-von-sushi commented 5 years ago

I did read your comment, sorry if I am just not understanding it correctly? Or do you mean to skip the TWRP restore and make tar not only uncompress, but also move files to the correct location? (Either way that could work, I just hope it preserves permissions...).

I'll try restoring it later today. Thanks though.

whitedavidp commented 5 years ago

First, I must give my HUGE thanks to both BuildingAtom and venerjoel99, whose efforts really saved my butt. It was not nice to discover, upon trying to restore one of my nightly backups, that it could not complete due to this problem. With their tools, I was able to discover/fix the data.ext4.win00? file that was hosed up.

In my case I used the BigTarCleaner.exe and it reported the problem at the same file in the archive that was last shown being processed in the recovery log file. That was reassuring. So now I am back up and running after the semi-hard brick on my Android TV computer.

I have long been running an open recovery script nightly to perform the backup. And upon reboot after backup completion, I have been checking the hashes to be sure things are ok. Of course, the hashes were ok and I never had any idea the backups would not be restore-able.

So I would like to find a way to test for this condition in my post backup script. I have tried using grep for either "I:Closing tar" or "storing xattr user" and am not finding either despite knowing that a .win00? file is in fact corrupt. So I must be missing something.

So can someone suggest a good way, inside a shell script, to perform a test for this condition?

Can someone suggest a good way to perform the fix, if the above test finds a problem, from inside a shell script?

This is all running, of course, on my Android 5 TV box.

Thanks again!

wget commented 5 years ago

Hello everyone. I'm experiencing this bug on a OnePlus 6T with TWRP 3.3.1-1 (official release available from here).

The problem appears with my system partition. The latter cannot be restored and I'm experiencing the aforementioned error extractTarFork() process ended with ERROR: 255.

Investigations

My investigations led me to a lack of space on the system partition that is being formatted by TWRP during the restore process.

How I fixed my issue ?

I'm not relying on the update feature of TWRP any more and am imaging the current partitions the same way TWRP does for image partitions: using dd. For example,

dd if=/dev/block/bootdevice/by-name/system_b of=/sdcard/backup/YYYY-MM-DD/system_b.bak

To restore the partition in the event I brick my device, I'm just using fastboot:

fastboot flash system /home/wget/[...]/system_b.bak

bigbiff commented 5 years ago

@wget Backing and restore system_image doesn't work?

bigbiff commented 5 years ago

I am closing this because these comments are referencing the same issue.

- ADB Backup
- Possible corruption of backup files
- Multi-user profiles

These are covered under other issues on the tracker.

wget commented 5 years ago

@wget Backing and restore system_image doesn't work?

Of course the backing and restore of the system_image works, since this is based on blocks rather than on files (using dd instead of cp).

Like described above, the problem arises from the way the partitions and symlinks are created and followed. Just that and nothing more :)

wget commented 5 years ago

I am closing this because these comments are referencing the same issue.

- ADB Backup
- Possible corruption of backup files
- Multi-user profiles

These are covered under other issues on the tracker.

For the ease of persons following this bug since it has been split in several sub-bugs, would you mind linking these sub-bugs to this one, or at least giving their links here? That way, we can follow easily where the discussion goes :) Thanks! :)

jxu commented 5 years ago

I had this error on TWRP 3.3.1-0 for my OnePlus 5T (dumpling)

Like @theorangepotato I get lots of errors when using tar on my computer when extracting data.ext4.win. All other partitions seem fine. tar still extracts most files.

tar: Ignoring unknown extended header keyword 'TWRP.security.e4crypt'
tar: Malformed extended header: missing equal sign

I can re-tar all the files and TWRP will restore them but then the phone doesn't boot

I restored my phone back to stock using official ROM ZIP https://forum.xda-developers.com/oneplus-5t/how-to/official-oxygenos-4-7-2-7-1-1-ota-t3709265

but I have lost a lot of trust in TWRP...

jxu commented 5 years ago

related #1048 #1452 #1472 #1302 #1279 #1237 #1177 * #1103 #1093 #910 #565 #520 #267 maybe #1256 #783 #745

samarium commented 5 years ago

Try manual tar extract on the phone in the TWRP shell environment, not in the GUI environment. This worked for me.

whitedavidp commented 5 years ago

I thought I should report my findings from testing after finding this issue...

In my case, I was performing the TWRP backup to my device's internal memory and then copying it later to an external SD card and then deleting the one in internal memory. I discovered that the backup while on the internal memory was fine but the one on the external SD was getting corrupted. I have since tossed that SD card and started using another. Since then, I have not been able to detect this problem.

However, I have significantly updated my nightly TWRP backup routine to test all this, to try to "fix" the problem using tools from BuildingAtom and venerjoel99 if needed, and to perform other tests like MD5 checks and tar listings. I do this before copying to external SD and again on the files on the external SD after copying. If I encounter anything wrong, I email myself and do NOT delete the internal memory files.
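
For anyone wanting to try the copy-then-check step, here is a minimal Python sketch of it. It is not the routine described above, only an illustration; `copy_and_verify` is a hypothetical name. It copies the file and then compares the two copies byte for byte, so the caller can keep the internal copy whenever the comparison fails. One caveat: a read straight after a write may be served from the OS cache instead of the card, so re-checking after a remount or reboot is the safer test.

```python
import filecmp
import shutil


def copy_and_verify(source, destination):
    """Copy a file, then compare both copies byte for byte.

    Returns True when the copies are identical. The caller decides
    whether it is safe to delete the source afterwards.
    """
    shutil.copyfile(source, destination)
    # shallow=False forces a content comparison instead of trusting
    # file size and modification time.
    return filecmp.cmp(source, destination, shallow=False)
```

A backup script would delete the internal copy only when this returns True, and send a notification otherwise.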

You just cannot be too careful.

Murrfk commented 5 years ago

My findings were that if there was not enough UNUSED/Empty space in memory to recover the backup you would get that 255 error.

whitedavidp commented 5 years ago

Hello. I do not think the problem with my SD card was lack of space. In fact, I am sure of it. I think the problem was that the card was somehow corrupted. However, having said that, I have run several low level tests on it and they have not shown any problems. I can say that since I swapped the SD card out for another of the same size (but from a more reputable maker) I have not detected any more of these problems using my updated error detection routines. Having said this, I do not think that everyone reporting here has the same trouble as I did. But the symptoms are sure similar. Cheers!

jxu commented 5 years ago

The symptoms are similar because all we get is an exit code that is the same for any error. As mentioned elsewhere, TWRP uses a custom tar implementation. I believe debugging would be easier if a standardized implementation of tar or image files produced by dd were used, possibly along with a metadata file with the extended tar headers. But I cannot impose my view on the developers because they understand the project far more than I do.

whitedavidp commented 5 years ago

You are clearly more informed on this than I am. But it sure seems like there is more to it than the same error code being returned, at least in my case and some others. The tools from BuildingAtom and venerjoel99 both address a kind of corruption found in the tar file and "fix" it. These "fixes" seem, at least in my case and theirs, to permit the tar files to be extracted. Now perhaps these are all one and the same. And perhaps, as you point out, it is due to a custom tar implementation that is weak somehow. But it sure seems from my experience that garbage is getting put into the tar files as they are being created, which creates unexpected conditions during extraction. I would love to see a more robust solution. But I have no idea what that might be or might entail. Cheers

jxu commented 5 years ago

I would not consider myself more knowledgeable about this beyond some basic knowledge about partitions and dd, which is elaborated at the end here https://github.com/TeamWin/Team-Win-Recovery-Project/issues/964#issuecomment-520765983

More discussion here https://github.com/TeamWin/Team-Win-Recovery-Project/issues/1472#issuecomment-527226920

chrcoluk commented 5 years ago

Wow, this is extremely concerning. I am surprised it isn't fixed, and that there is no publicity on the issue on news sites, XDA, etc.

Corrupted backups in a backup program? I mean, it almost makes TWRP unfit for purpose.

I too found out today, during a restore, that I couldn't restore the DATA partition.

This is potentially very dangerous as well, as the restore process first wipes and then restores, so if the restore fails, the old data is gone regardless. What happens if restoring EFS fails?

I just don't understand why this has sat for 2 years.

bigbiff commented 5 years ago

There are many different reasons that this happens. The majority of issues are caused by corrupted files over time on flash cards/emmc. We recommend you pull the backup files off the phone to make sure they don't get corrupted over time. We can probably do more defensive code practices, but there are only 2 of us who work on TWRP part-time. Please feel free to provide help at https://gerrit.omnirom.org/ on the android_bootable_recovery project.

The other issue is that these issues are never opened to help us debug, only to vent. It's hard to do anything without simple debugging logs.

oregszun commented 4 years ago

This is not related to the card/eMMC. I tried different cards and did the backup and restore back to back. I even tried the same card in a Hammerhead with TWRP backup/restore and it worked. This one is a device-related issue, I think!

Assadginem commented 4 years ago

This error has been a long time, for me it no longer occurs in the latest version of TWRP.

Still happening

ahmedmoselhi commented 4 years ago

Same error on rmx1851. Solved by unmounting firmware during the restore process.

HaleTom commented 4 years ago

Can confirm that this is not a hardware issue:

I did a low-level factory restore (factory image including partitioning data), and after that I can now backup /data again.

My guess is it's something to do with multi-user profiles and/or lock screen PIN / pattern. (Both of which I had previously).

Is anyone experiencing this WITHOUT multi-user and NEVER having used a lock screen PIN / pattern?

CaptainThrowback commented 4 years ago

Restoring an FBE device with multiple users is a known issue. The fix for it is pending review on Gerrit.

And there's already an open issue for it (note that this current issue is closed): https://github.com/TeamWin/Team-Win-Recovery-Project/issues/1373

aiamuzz commented 4 years ago

the fact that this is being discussed since 2017 and is still being discussed ... seems like a notorious issue !!!

I faced this issue on a day old backup ... now stuck with no way to restore back to it !!!

I tried to boot into the latest version using fastboot boot twrp-3.3.1-0-oneplus3.img directly downloaded from the official website ... and tried restoring the day old backup created on the version twrp-3.1.1-x_blu_spark_v37-op3_op3t.img ... but even this latest version failed with the above error !!!

Now i have flashed the latest version twrp-3.3.1-0-oneplus3.img after flashing my device with stock ROM... but given the inability to restore the day old back up ... i think i am suffering 'TWRP - PTSD' ... no confidence at all that my next attempt to backup and restore will succeed !!!

FYI ... i backed up on the version of twrp-3.1.1-x_blu_spark_v37-op3_op3t.img ... and restoring system partition and data partition have been encountering the above error !!!

I guess my "TWRP - PTSD" will continue until someone here reassures this bug is fixed for good !!!

whitedavidp commented 4 years ago

aiamuzz - in an effort to address the PTSD, which I also faced on my Android TV box, have you tried the tools from BuildingAtom and venerjoel99 mentioned above? In my case, they allowed me to identify the problems and "fix" them. I put "fix" in quotes because I am not 100% sure that everything was 100% restored perfectly. But I did have completed restores and now have a working system.

After my PTSD passed with a successful restore, I added a number of validation steps that are run after my TWRP backups complete. They, in part, use the tools mentioned above. But I am also performing checks of the hashes and tar listings to /dev/null looking for errors. My routines email me if something fails since my backups run in the middle of the night.

All of this said, I think that my problem was induced by an at least partially bad SD card onto which backups were being written. I tested the thing over and over on Windows using SD card test tools which reported no problems. Finally I swapped SD cards and the problems went away (in my case meaning that my backup validation routines stopped reporting errors). The faulty SD card went into the trash after I smashed it with a small mallet.

BuildingAtom commented 4 years ago

I haven't been very active since being in college and studying engineering happens to be very time consuming, but I wanted to comment and reiterate a few details to clarify the specific issue and case that venerjoel99 and I wrote our tools to address.

TL;DR - Error 255 can mean any number of things. Our programs specifically solve the problem of extra junk data being inserted into the backup files. My program is crap now, and venerjoel99's is good -> (https://github.com/venerjoel99/TarProject)

I'll be writing off the top of my head, so sorry if I make any mistakes. Anything related to how TWRP works itself is pretty much going to be more speculation than not since I haven't checked.

As far as I have been able to tell, backups of different partitions/locations result in one of two different types of files being created. Either tar files are created, or binary images are created. When creating a system image, boot, or recovery backup, TWRP creates a bit-by-bit copy of those partitions. When creating a data or system backup, TWRP creates either one or a set of tar archive files depending on how large the backup needs to be. The specific problem we aimed to solve was the issue of extra data being inserted into the tar-based backup files. More details on the data inserted and possibly why in this specific comment and some following.

Tar as a file format works in 512-byte blocks. It expects everything to abide by this block size, which means that the header containing the metadata for each file, and the data for each file, must start at some multiple of 512 bytes. It also means that the data for each file has null bytes appended to the end, if needed, to bring it to a size that is a multiple of 512 (the actual size of the file is stored in the header). To mark the end of the archive, two consecutive empty blocks signal to the extracting program that there are no more files.
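
To make the block layout concrete, here is a small Python sketch that walks an uncompressed tar image held in memory. It is not code from either tool; the names `parse_octal`, `header_is_valid` and `walk_tar` are made up for illustration. It relies only on the standard ustar header layout (name at offset 0, size at offset 124, checksum at offset 148) and assumes sizes are stored as octal text, so it gives up on entries that use the binary size encoding for very large files.

```python
BLOCK = 512


def parse_octal(field):
    """Decode a NUL/space padded octal ASCII field from a tar header."""
    text = field.split(b"\0", 1)[0].strip()
    return int(text.decode("ascii"), 8) if text else 0


def header_is_valid(block):
    """Verify the header checksum: the sum of all 512 bytes, with the
    checksum field itself (offsets 148-155) counted as eight spaces."""
    if len(block) != BLOCK:
        return False
    try:
        stored = parse_octal(block[148:156])
    except ValueError:
        return False
    return stored == sum(block[:148]) + 8 * 0x20 + sum(block[156:])


def walk_tar(data):
    """Walk the headers of an uncompressed tar image.

    Returns (entries, error_offset). entries holds (offset, name, size)
    for each header read. error_offset is None when the end-of-archive
    marker (two zero blocks) was reached, otherwise the offset at which
    the walk had to stop.
    """
    entries = []
    zero = bytes(BLOCK)
    offset = 0
    while offset + BLOCK <= len(data):
        block = data[offset:offset + BLOCK]
        if block == zero:
            if data[offset + BLOCK:offset + 2 * BLOCK] == zero:
                return entries, None
            return entries, offset
        if not header_is_valid(block):
            return entries, offset
        try:
            size = parse_octal(block[124:136])
        except ValueError:
            return entries, offset
        name = block[:100].split(b"\0", 1)[0].decode("utf-8", "replace")
        entries.append((offset, name, size))
        # File data is padded with NULs up to the next multiple of 512.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return entries, offset
```

If even a few stray bytes end up between blocks, the next header no longer sits on a 512-byte boundary, its checksum fails, and `walk_tar` reports the offset where things went wrong. Holding a whole chunk in memory is only for illustration; a compressed chunk would need decompressing first.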

The problem here is that when these strings are occasionally leaked, the data within the file gets offset, and whatever extracting program is trying to read the file suddenly gets lost and throws an error. In the small amount of testing we did, we found that data never gets lost; data only gets added to the file, and specifically in between these data blocks. Thus, roughly speaking, the programs we wrote skim along these boundaries, look for specific strings that we know were leaked, and then delete them to realign the files. If the programs complete without error and the restore completes without errors, then the backups have almost certainly been cleaned.
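
The leaked strings themselves are not listed here, so the following Python sketch swaps in a different and much cruder heuristic than the one our programs use: once a header is missing at the offset where it was expected, it searches forward byte by byte for the next block that looks like a header and reports how far the archive has shifted. The names `looks_like_header` and `find_shift` are hypothetical, and the check assumes the archive writes the "ustar" magic at offset 257. It only diagnoses; it does not say where inside the preceding entry the surplus bytes sit, so it cannot safely repair anything on its own.

```python
BLOCK = 512


def looks_like_header(block):
    """True when a 512-byte block carries the ustar magic and a
    checksum that matches its contents."""
    if len(block) != BLOCK or block[257:262] != b"ustar":
        return False
    field = block[148:156].split(b"\0", 1)[0].strip()
    try:
        stored = int(field.decode("ascii"), 8)
    except ValueError:
        return False
    # The checksum field counts as eight spaces while summing.
    return stored == sum(block[:148]) + 8 * 0x20 + sum(block[156:])


def find_shift(data, expected_offset, window=4096):
    """Search forward from an offset where a header was expected.

    Returns the number of surplus bytes in front of the next plausible
    header, or None when none is found within `window` bytes.
    """
    for shift in range(1, window + 1):
        start = expected_offset + shift
        if looks_like_header(data[start:start + BLOCK]):
            return shift
    return None
```

A non-None result means that many bytes were added somewhere before the next header, which fits the "data gets added, never lost" observation above.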

Important to note is that, at the time, MD5 hashes were generated after the files were created. Thus the MD5 hash only worked to tell whether the files became corrupted over time. If the files were created corrupted, which is the issue that venerjoel99 and I set out to automate a solution to, the MD5 hashes were effectively useless for restore purposes. Of course, knowing whether an already corrupted file was corrupted further is still important, because if it was, then our programs most likely won't help either.
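
For completeness, a minimal Python sketch of that kind of MD5 check. The sidecar layout is an assumption on my part (a text file whose first token is the hex digest, followed by the file name), and `file_md5` and `matches_sidecar` are made-up names. As said above, a match only shows that the file has not changed since the hash was written, not that the archive was sound when it was created.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sidecar(archive_path, sidecar_path):
    """Compare an archive's MD5 with the first token of its sidecar file.

    Assumes the sidecar starts with the hex digest; anything after it
    (such as the file name) is ignored.
    """
    with open(sidecar_path, "r", encoding="ascii", errors="replace") as handle:
        tokens = handle.read().split()
    expected = tokens[0].lower() if tokens else ""
    return file_md5(archive_path) == expected
```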

For files that have become corrupted over time due to SD/eMMC corruption, there is unfortunately not much that can be done after the fact. While it may be possible to manually extract specific files, not only would it be hard to determine if the integrity of those files were maintained, but it would also take a ridiculous amount of time given the number of files.

Since I haven't had as much leeway to maintain and improve my program as I would like, I fully recommend venerjoel99's TarCleaner instead. I currently have an issue on my program which I unfortunately have yet to fully finish addressing (though I will get to it someday).


When this first happened to me, I was so stressed by the fact that I couldn't restore from the backup I had just made before factory resetting my phone that I eventually ended up exploring the backup files in HxD, which led to effectively the same discovery as was shared way up at the top of this issue. I manually cleaned the files then, but it was running into that problem and having literally no other existing solutions work that led venerjoel99 and me to create our programs. So while I totally feel the "TWRP - PTSD," knowing more about the problem and how it happened makes me a lot less worried.

aiamuzz commented 4 years ago

aiamuzz - in an effort to address the PTSD, which I also faced on my Android TV box, have you tried the tools from BuildingAtom and venerjoel99 mentioned above? In my case, they allowed me to identify the problems and "fix" them. I put "fix" in quotes because I am not 100% sure that everything was 100% restored perfectly. But I did have completed restores and now have a working system.

Yes ... the system failed to restore throwing up the same error !!!

aiamuzz commented 4 years ago

After my PTSD passed with a successful restore, I added a number of validation steps that are run after my TWRP backups complete. They, in part, use the tools mentioned above. But I am also performing checks of the hashs and tar listings to /dev/nul looking for errors. My routines email me if something fails since my backups run middle of the night.

oh ... but i guess its well over my pay grade to be able to do all that ...

All of this said, I think that my problem was induced by an at least partially bad SD card onto which backups were being written. I tested the thing over and over on Windows using SD card test tools which reported no problems. Finally I swapped SD cards and the problems went away (in my case meaning that my backup validation routines stopped reporting errors). The faulty SD card went into the trash after I smashed it with a small mallet.

OH !!! i hope not cause my phone does not have SD card its a 128 GB inbuilt ... so mallet is not really an option !!!

whitedavidp commented 4 years ago

After my PTSD passed with a successful restore, I added a number of validation steps that are run after my TWRP backups complete. They, in part, use the tools mentioned above. But I am also performing checks of the hashs and tar listings to /dev/nul looking for errors. My routines email me if something fails since my backups run middle of the night.

oh ... but i guess its well over my pay grade to be able to do all that ...

i can try to get it together + post it somewhere. let me know.

All of this said, I think that my problem was induced by an at least partially bad SD card onto which backups were being written. I tested the thing over and over on Windows using SD card test tools which reported no problems. Finally I swapped SD cards and the problems went away (in my case meaning that my backup validation routines stopped reporting errors). The faulty SD card went into the trash after I smashed it with a small mallet.

OH !!! i hope not cause my phone does not have SD card its a 128 GB inbuilt ... so mallet is not really an option !!!

seems the issues can occur for other reasons. so hold the hammer!

aiamuzz commented 4 years ago

All of this said, I think that my problem was induced by an at least partially bad SD card onto which backups were being written. I tested the thing over and over on Windows using SD card test tools which reported no problems. Finally I swapped SD cards and the problems went away (in my case meaning that my backup validation routines stopped reporting errors). The faulty SD card went into the trash after I smashed it with a small mallet.

OH !!! i hope not cause my phone does not have SD card its a 128 GB inbuilt ... so mallet is not really an option !!!

seems the issues can occur for other reasons. so hold the hammer!

hahahaha ... @whitedavidp

just to test the same ...

  1. I have upgraded TWRP to the latest official version available directly from the website.
  2. I have made 2 backups on this latest TWRP version, one on internal storage and the other on an SD card (over OTG)
  3. I now know to collect the log in case I encounter the issue

fingers crossed ... the next time i have spare time ... i'll experiment and post my findings and logs

Maybe I will try to restore the system img that throws up this error and collect its logs so one of you experts can figure out why it happened on my previous unofficial custom version of TWRP.

whitedavidp commented 4 years ago

Just to follow up on my earlier post...

I automate regular backups using OpenRecoveryScript:

    # now create an openrecovery script in /cache/recovery
    chmod 777 /cache/recovery
    echo "set tw_storage_path /data/media/0" > /cache/recovery/openrecoveryscript
    echo "backup DSBCR DailyBackup" >> /cache/recovery/openrecoveryscript
    chmod 777 /cache/recovery/openrecoveryscript
    /system/bin/reboot recovery

So TWRP gets control and runs the backup script, then reboots when the backup is complete. I have Tasker run a script after boot has completed to check the backup that was created. Here are the salient fragments. First, I check the md5 files (modify if your device uses a different hash):

    check_md5() {
      for f in *.md5; do
        echo "checking md5 for $f"
        md5_twrp=$(cat $f)
        md5_result=$(md5 $(basename $f .md5))

        if [ "$md5_twrp" != "$md5_result" ]
        then
          echo "Possible md5 problem: $md5_twrp -- should be -- $md5_result"
        fi
      done
      echo "Completed check_md5"
    }
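If your device only ships `md5sum`, whose output appends the file name after the hash, comparing whole lines as in `check_md5` can raise false alarms. Here is a minimal sketch of a variant that compares only the hash field. It assumes `md5sum` and `awk` are available and that each `.md5` file begins with the hash; `check_md5_field` is just an illustrative name.

```shell
#!/bin/sh
# Sketch only: compare just the hash field, ignoring any trailing file name.
# Assumes md5sum and awk exist and each .md5 file begins with the hash.

check_md5_field() {
  result=0
  for f in *.md5; do
    [ -e "$f" ] || continue            # no .md5 files: the glob stayed literal
    target=$(basename "$f" .md5)
    expected=$(awk 'NR == 1 { print $1 }' "$f")
    actual=$(md5sum "$target" | awk '{ print $1 }')

    if [ "$expected" != "$actual" ]; then
      echo "Possible md5 problem for $target: expected $expected, got $actual"
      result=1
    fi
  done
  return $result
}
```

Unlike the fragment above, this variant returns non-zero when any hash differs, so a calling script can stop on failure.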

If there are any problems here, I just quit and send an error email to myself. Otherwise, I go on to check the .tar files themselves. First I do a simple test:

    check_tar() {
      result=0

      for f in *ext4*; do
        if [ -e $f.md5 ]
        then
          echo "checking tar for $f"
          tar -tf $f > /dev/null
          if [ $? -ne 0 ]
          then
            echo "$f has a bad tar result"
            result=1
          fi
        fi
      done

      echo "Completed check_tar"
      return $result
    }

If there are any problems here, I just quit and send an error email to myself. Otherwise, I go on to check the .tar files in more detail using the aforementioned TarCleaner utility, and then use the utility to attempt a "fix":

    check_tar2() {
      result=0

      for f in *ext4*; do
        if [ -e $f.md5 ]
        then
          echo "checking tar leakage for $f"
          /data/local/tarcleaner $f 2> /dev/null
          if [ $? -ne 0 ]
          then
            echo "$f has tar leakage"
            leaking_tars="$leaking_tars $f"
            result=1
          fi
        fi
      done

      for f in $leaking_tars
      do
        echo "attempting to fix leaked tar $f. leaked file has .leaked appended"
        mv $f $f.leaked
        /data/local/tarcleaner $f.leaked $f 2> /dev/null

        if [ $? -eq 0 ]
        then
          echo "re-checking fixed tar $f"
          /data/local/tarcleaner $f 2> /dev/null
          if [ $? -ne 0 ]
          then
            echo "fixed tar $f still has problems"
          fi
        else
          echo "tar $f was not fixed"
        fi
      done

      echo "Completed check_tar2"
      return $result
    }

Obviously, if there are errors here, I send an email to myself and give up.
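Tying the checks together could look roughly like the sketch below. It assumes each check function (such as `check_md5`, `check_tar` and `check_tar2` above) is defined in the same script and returns non-zero on failure. `run_checks` is an illustrative name, and `notify_failure` is a hypothetical placeholder for whatever sends the email, not a real command.

```shell
#!/bin/sh
# Sketch only: run each check in order and stop at the first failure.

# Hypothetical placeholder; replace the body with your own mail command.
notify_failure() {
  echo "backup verification failed at: $1" >&2
}

# Usage: run_checks check_md5 check_tar check_tar2
run_checks() {
  for step in "$@"; do
    if ! "$step"; then
      notify_failure "$step"
      return 1
    fi
  done
  echo "all backup checks passed"
}
```

Stopping at the first failure keeps the later, more expensive checks from running against a backup that is already known to be bad.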

I learned long ago that a backup that is not verified as completely as possible is hardly a backup at all. At this point, I cannot think of anything else to do.

Again, many thanks to the authors of the aforementioned utilities. They have saved my bacon!

CaptainThrowback commented 4 years ago

I believe this patch is under review, and should address some of these concerns: https://gerrit.omnirom.org/34077

oregszun commented 4 years ago

I have new findings trying to do manual tar backup in the TWRP/ADB terminal: (chagallte 3.3.1-0)

Maybe an exFAT handling issue?

FFF88 commented 4 years ago

Hi, I may be late, but I am facing the same problem. I made a full backup with the following command (via ADB on the PC, while the phone was running TWRP 3.11-0):

adb backup --twrp

But when I try to restore the backup with the command

adb restore "path/to/backup.ab"

On the phone, TWRP gives me this log:

    Full SELinux support is present.
    MTP Enabled
    Done.
    command is: 'adbrestore' and there is no value
    Restoring 3rdmodem...
    [3rdmodem done (1 seconds)]
    Wiping Cache
    Formatting Cache using make_ext4fs...
    Restoring Cache...
    extractTarFork() process ended with ERROR: 255
    E:ADB Restore failed.
    Done processing script file

The backup file is about 9 GB, but I formatted the phone beforehand, so I have plenty of free space. I'm not very experienced with coding, so I am here to ask whether there is a fix for this problem. I cannot run tests such as "does backing up only system work", because I have wiped system, cache and data since making the backup.

Is there someone who could help me? Thanks in advance!