Drewsif / PiShrink

Make your pi images smaller!
MIT License

Why does pishrink report errors that GParted does not? #228

Closed Jibun-no-Kage closed 1 year ago

Jibun-no-Kage commented 2 years ago

I have a Pi OS 11 image file that works with no issues. When I read the file via Win32DiskImager, or in Linux using dd, there are no issues or errors. When I write the image file via Win32DiskImager or with dd, there are no issues either.

However, when I use PiShrink I get the following error:

pishrink.sh -vspzr image.img image-shrink.img

pishrink.sh: Checking filesystem ...
/dev/loop0 is mounted.
e2fsck: Cannot continue, aborting.

pishrink.sh: Filesystem error detected! ...
pishrink.sh: Trying to recover corrupted filesystem ...
e2fsck 1.46.2 (28-Feb-2021)
/dev/loop0 is mounted.
e2fsck: Cannot continue, aborting.

pishrink.sh: Trying to recover corrupted filesystem - Phase 2 ...
e2fsck 1.46.2 (28-Feb-2021)
/dev/loop0 is mounted.
e2fsck: Cannot continue, aborting.

pishrink.sh: ERROR occurred in line 64: Filesystem recoveries failed. Giving up...

If I use GParted, no such error results. What is the difference? GParted uses: e2fsck -f -y -v -C 0 '/dev/sdb2'
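One way to tell "e2fsck found damage" apart from "e2fsck refused to run" is to look at its exit status, which the e2fsck(8) man page documents as a bitmask. The sketch below decodes that bitmask. `describe_e2fsck_status` is a hypothetical helper name, not something PiShrink provides, and it is an untested assumption that the "is mounted ... Cannot continue, aborting" case in the log above sets the operational-error bit (8) rather than the uncorrected-errors bit (4).

```shell
# Decode an e2fsck exit status, which is a sum of the bit values
# listed in the e2fsck(8) man page.
# describe_e2fsck_status is a hypothetical helper, not part of PiShrink.
describe_e2fsck_status() {
  status=$1
  if [ "$status" -eq 0 ]; then
    echo "no errors"
    return 0
  fi
  msg=""
  [ $((status & 1)) -ne 0 ]   && msg="$msg errors-corrected"
  [ $((status & 2)) -ne 0 ]   && msg="$msg reboot-recommended"
  [ $((status & 4)) -ne 0 ]   && msg="$msg errors-left-uncorrected"
  [ $((status & 8)) -ne 0 ]   && msg="$msg operational-error"
  [ $((status & 16)) -ne 0 ]  && msg="$msg usage-error"
  [ $((status & 32)) -ne 0 ]  && msg="$msg cancelled"
  [ $((status & 128)) -ne 0 ] && msg="$msg shared-library-error"
  # Strip the leading space left by the first append.
  echo "${msg# }"
}
```

It could be called right after a check, for example `e2fsck -f -y "$dev"; describe_e2fsck_status $?`, where `$dev` stands for whatever device is being checked.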

Jibun-no-Kage commented 2 years ago

This is really breaking PiShrink; it fails about 90% of the time. I have captured images in Windows, Linux, etc., using various tools. The images are functional and work fine; I rarely have any issues, apart from, say, a bad or old SD card. But PiShrink routinely spits out errors that even the -r option can't fix. Every time this happens, I pull the image file or even the SD card into GParted, and can resize the rootfs partition with no issues. Can this please be addressed in some active manner? Could this be related to the long-known issue of vFAT/FAT32 support being questionable in e2fsck? I keep seeing references to long-standing known bugs with it. Also, PiShrink appears to try to validate the /boot partition. Why is this being done when the goal is to resize just rootfs? Even GParted seems to hate the /boot partition. How about an option to not touch the /boot partition at all?

Drewsif commented 2 years ago

I've not run into this issue. I will look into the options GParted uses for e2fsck; I think I just picked what I thought was sensible, but it could be overly restrictive. I know the version that shipped with Ubuntu was not working for a while, so it might be worth checking if you can get an updated version of e2fsck.

Regarding it validating /boot, I don't think it should be trying to verify it at all unless it's the last partition. What leads you to believe it's inspecting it?

Jibun-no-Kage commented 2 years ago

Yeah, when I realized that GParted was not failing even though it can't process /boot (given it is vfat) and snags on it in some way, that suggested pishrink might be having a similar issue. I think the only partition that you should shrink is rootfs, right?

adbarbosa commented 1 year ago

I have the same problem.

paul-ridgway commented 1 year ago

I believe the underlying issue is that the loop device is mounted:

/dev/loop0 is mounted.
e2fsck: Cannot continue, aborting.

If you comment out e2fsck it then gets stuck resizing because the partition is mounted and online resizing is not supported.

I found this issue after switching to KDE which automounts - if you turn this off in the Removable Device settings it appears to work for me.
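A script could detect this situation up front instead of letting e2fsck abort. The following is only a sketch of that idea: `device_is_mounted` is a hypothetical helper (not part of PiShrink) that looks for the device in the first column of `/proc/mounts`, and it assumes the automounter mounted the loop device itself, as in the log above.

```shell
# device_is_mounted DEVICE [MOUNTS_FILE]
# Returns 0 if DEVICE appears as a mount source in MOUNTS_FILE
# (default: /proc/mounts), non-zero otherwise.
# Hypothetical helper, not part of PiShrink.
device_is_mounted() {
  awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
    "${2:-/proc/mounts}"
}
```

A caller might then write `if device_is_mounted /dev/loop0; then echo "loop device is mounted; disable automount and retry" >&2; fi` before invoking e2fsck. The optional second argument exists only so the check can be tried against a saved copy of a mounts table.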

Jibun-no-Kage commented 1 year ago

I noticed that if pishrink aborts for some reason, the loop device cleanup does not seem to complete consistently: sometimes cleanup is done, sometimes not. Not sure if this is a design issue or not; I have not had a chance to review the code to determine that.
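For illustration, one common shell pattern for making cleanup run on every exit path is an `EXIT` trap. This is a generic sketch, not a description of how PiShrink handles cleanup: `with_loop_cleanup` and the `LOSETUP` override are hypothetical names, and the only real command assumed is `losetup -d`, which detaches a loop device.

```shell
# LOSETUP can be overridden (for example with "echo") for a dry run.
LOSETUP="${LOSETUP:-losetup}"

# with_loop_cleanup DEVICE COMMAND [ARGS...]
# Runs COMMAND in a subshell and detaches DEVICE when that subshell
# exits, whether COMMAND succeeded, failed, or called exit.
# Hypothetical helper, not PiShrink code.
with_loop_cleanup() {
  (
    dev=$1
    shift
    trap '"$LOSETUP" -d "$dev"' EXIT
    "$@"
  )
}
```

With `LOSETUP=echo`, calling `with_loop_cleanup /dev/loop7 false` prints `-d /dev/loop7`, which shows the detach step running even though the wrapped command failed.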

framps commented 1 year ago

Not sure if this is a design issue

I don't think pishrink was designed. It's just a nice hack which works for a lot of people but unfortunately fails for a lot of other people. An open source project should be maintained - but @Drewsif doesn't maintain pishrink. That said, I think @Drewsif should announce that pishrink is no longer maintained to make sure everybody is on the same page.

paul-ridgway commented 1 year ago

It doesn't always clean up, no - but that's not giving rise to the issue here as new loop devices are assigned dynamically

framps commented 1 year ago

There are 18 outstanding PRs since 2017 ... @Drewsif published his hack and now he's no longer interested in any maintenance of pishrink. Not well done :-((((((

paul-ridgway commented 1 year ago

That's pretty backwards, calling it a hack and then expecting it to be maintained considering anyone who cares could fork it and take over....

framps commented 1 year ago

Glad you agree with me and call pishrink a hack. And I agree with you: Nobody can expect a hack to be maintained. But then @Drewsif should document this in the README:

pishrink works in a lot of environments but it's not bulletproof. There is no maintenance and if it fails don't create a PR. Create a fork and fix the issue.

paul-ridgway commented 1 year ago

Lol, I don't agree with what you say, that's the point. Software maintenance is often not quick or cheap

framps commented 1 year ago

Software maintenance is often not quick or cheap

Agree with you. But people spent their precious time to create PRs to help to improve pishrink but these PRs are ignored. In addition people spent their precious time to create issues which are not worked on.

Some clear statement in the README will make sure everybody knows about this and doesn't spend any time to improve pishrink - at least not in this repo but maybe in his fork.

monsieurborges commented 1 year ago

Dudes what's going on with PiShrink?

I've seen a lot of issues and PRs with no answers or solutions... I even created a Dockerized version of PiShrink, and it's a shame it wasn't integrated into the project...

I think it's time for us to join forces and start collaborating more!

@Drewsif I'll see if I can spend some time fixing some issues and sending some PRs. What do you think about this?

framps commented 1 year ago

I think it's time for us to join forces and start collaborating more!

I agree with you. But the target repository shouldn't be this repository. I have no confidence in @Drewsif .

Drewsif commented 1 year ago

So I am still alive and here. I don't really work on this project anymore but I do pay attention. The issue is a lot of the issues opened are either problems I can't replicate or are rather generic issues that I can't act on. I do not do a good job of cleaning them out as I try to follow up to see if I can get something to replicate but most of the time I can't.

The PRs take time for me to review and test. I've had PRs that looked fine and then broke as soon as I merged them in if I didn't test well. People have been using PiShrink on systems I never expected with no issues, which is amazing and I love seeing, but trying to support every use case and system is quite the task and time intensive. The scope creep also gets real with some of them where I think they are too system specific or belong in a supporting script for that particular use case.

Ignore @framps, he got mad because I didn't merge one of his PRs because I wanted to do it the right way. Took a bit to do that, so he claimed he was done working on PiShrink and made a dramatic goodbye post then shows up to be pissy in issues. I just blocked him so we can keep the conversations productive.

I do appreciate the people who help others with issues and submit PRs or Issues. I do get to them very slowly as this is a hobby for me and not a full time job. I tend to randomly pick it up, make sure nothing is super broken, and go through the lists of issues and PRs. I'm open to more collaboration for sure, I'd love to find devs that I can trust to help out.

paul-ridgway commented 1 year ago

Ha, nice one @Drewsif. In any case, I appreciate PiShrink and I'm sure many others do - I'm sure you've saved people many many hours not having to figure their own solution out, and while it could always be better (all software can) it doesn't have to be 'perfect' either - I modify it for my needs and anyone who knows what they are doing can work their way around issues.

I think you can close this one - it looks like host OS automount behaviour, which you could try and catch, but I'm not sure how much can be done to actually mitigate it since umount was a step too far and there are no losetup / e2fsck / resize flags that will ignore that it's mounted...

monsieurborges commented 1 year ago

@Drewsif I suggest you enable GitHub Discussions, so some specific topics and settings can be documented and updated by the community.

About the PRs, maybe we will need to create a test workflow to validate the changes, and templates to get more details on how to collaborate and ask for help.

Just doing marketing for one of my projects, I have a repo, raspberry-pi, with all the details on how to get started with Raspberry Pi. It might be useful to add references between the projects.

Documentation => Clone and shrink a Raspberry Pi image

ghost commented 1 year ago

So I am still alive and here. I don't really work on this project anymore but I do pay attention.

Glad you're still alive. I frankly didn't expect you to follow any activity on your repo any more.

The issue is a lot of the issues opened are either problems I can't replicate or are rather generic issues that I can't act on. I do not do a good job of cleaning them out as I try to follow up to see if I can get something to replicate but most of the time I can't.

That's where a swarm can help ;-)

The PRs take time for me to review and test. I've had PRs that looked fine and then broke as soon as I merged them in if I didn't test well. People have been using PiShrink on systems I never expected with no issues, which is amazing and I love seeing, but trying to support every use case and system is quite the task and time intensive.

I see your point and that's why I tried to help you. It's amazing to see people use the Windows Subsystem for Linux (WSL) to run pishrink to shrink the images they created with win32diskimager, Rufus and other Windows SD card imager tools.

The scope creep also gets real with some of them where I think they are too system specific or belong in a supporting script for that particular use case.

Like you, I just run Linux systems and also cannot debug or recreate any of the issues reported. That's why pishrink should check for supported environments and reject all unsupported environments.

Ignore @framps, he got mad because I didn't merge one of his PRs because I wanted to do it the right way. Took a bit to do that, so he claimed he was done working on PiShrink and made a dramatic goodbye post then shows up to be pissy in issues. I just blocked him so we can keep the conversations productive.

I got mad because I tried to help you but I didn't get any feedback from your side. I felt like somebody knocking on the door to help, but the door was not opened and there was no response or feedback.

I didn't get mad because you didn't merge my PR. A PR is just a code change proposal and is supposed to be reviewed, discussed and either updated and then merged or just rejected if the PR does not fit into the existing code. I got mad because I didn't get any feedback.

I got mad because I got the feeling you don't work on this project any more. That's OK. A lot of people sooner or later decide to no longer support their project because of various reasons. But then these guys document this in their README and the community can decide how to continue to maintain the project. Usually another fork is used to continue on this project.

I got mad because 18 people spent their time to create PRs to make pishrink more robust or add additional features but they are not worked on.

I got mad because I see 50 issues and no feedback. I understand it's a hard job to work on all of them, and often impossible to fix them if you don't have the environment to reproduce them or they lack detailed debug information. So why didn't you call for help on this? As far as I understand there are folks willing to help - including me - but you didn't accept the help. Looks like you now changed your mind and you are now open for collaboration :-)

I do appreciate the people who help others with issues and submit PRs or Issues. I do get to them very slowly as this is a hobby for me and not a full time job. I tend to randomly pick it up, make sure nothing is super broken, and go through the lists of issues and PRs. I'm open to more collaboration for sure, I'd love to find devs that I can trust to help out.

I frankly don't think a full SD partition backup is the right backup strategy. There are Linux tools out there which are much faster and more reliable. But I see mostly Windows users like the Windows backup tools win32diskimager, Rufus et al. and pishrink helps them to keep their images small. They can easily restore the image on a Windows system and don't have to bother with Linux. So it makes sense to make pishrink more robust.

pishrink works very nicely for a lot of people, but unfortunately they start to use pishrink in environments which you didn't think about, and thus pishrink lacks checks to refuse these unsupported environments or changes to support them.

Would be great if this repository can still be used to bundle collaborative efforts to improve and make a much more stable pishrink. But I think given you don't really work on this project any more this project will get a boost if you create an organization and grant the merge role to other users you trust. That way PRs and issues will be worked on much faster and the number of open PRs and issues will hopefully decrease. Other folks will be able to review PRs and comment on them. They don't need any merge role. In addition an agreed collaboration practice will help. I mean, how many folks should review a PR and who finally merges the PR. Also a discussion area how to improve pishrink independent of any PRs and issues will be very helpful.

From my point of view there are two major things to work on in pishrink asap:

1) Add additional debug statements such that it's possible to see what environment pishrink runs on. This will at least give a chance to understand the root cause of a failure and to reject an issue with "unsupported environment".

2) Add additional checks for all commands whether they succeed or fail. A lot of errors are caused just because commands fail when they don't get the expected results from previous commands.
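As a rough illustration of these two points, a shell script could log its environment and wrap each step in a checked call. This is a sketch, not PiShrink code: `print_environment` and `run_checked` are hypothetical helper names, and the tool list is an assumption about which external commands matter.

```shell
# print_environment: log basic facts about the host for bug reports.
# The tool list below is an assumption, not taken from PiShrink.
print_environment() {
  echo "kernel: $(uname -sr)"
  for tool in e2fsck resize2fs losetup parted; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: $(command -v "$tool")"
    else
      echo "$tool: MISSING"
    fi
  done
}

# run_checked DESCRIPTION COMMAND [ARGS...]
# Runs COMMAND and, if it fails, reports which step failed and with
# which exit status, then returns that status to the caller.
run_checked() {
  desc=$1
  shift
  rc=0
  "$@" || rc=$?
  if [ "$rc" -ne 0 ]; then
    echo "ERROR: $desc failed (exit $rc): $*" >&2
    return "$rc"
  fi
}
```

A step would then read something like `run_checked "attach loop device" losetup ... || exit 1`, so that a failure stops the script at the step that actually failed instead of surfacing later as a confusing follow-on error.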

You don't have to block this id I just created to reply to your comment above. I will delete the id shortly. Either unblock me (framps) and I can collaborate again with my github id or keep me blocked and I will shut up forever.

monsieurborges commented 1 year ago

Hey guys,

@Drewsif I see your point and understand you, but I believe you should reconsider your decision to block @Framps (@knirhsip).

Emotions aside, I see him as a good contributor to the project, and his criticisms have given you good reviews and points to improve.

You can do it! Unblock him and let's help this project evolve with the support of the community.

By the way, I loved the idea of rewriting it in Python. But first let's improve it on the script level ;)

Jibun-no-Kage commented 1 year ago

I can help test, etc. as well. I have a fair amount of Python experience. I used to write Python scripts for ESXi servers, and for Raspberry Pi devices of course, mostly related to sensor data capture and MQTT.

Drewsif commented 1 year ago

My decision stands for now, I love the enthusiasm behind the project and it's that positive attitude that makes me want to continue work. Having people come in and be negative dicks will not be tolerated and only serves to hinder my will to work on this project.

Related to the technical issue at hand, I'm closing this in favor of #254.