diskfs / go-diskfs

MIT License

ext4 support #9

Open the-maldridge opened 5 years ago

the-maldridge commented 5 years ago

This looks like a pretty amazing library and looks like an ideal package to form the basis of my installer for @void-linux. I'd need ext4 though to do this (and someday I'd love to see support for more exotic filesystems, even if it meant shelling out to their tools).

Do you foresee ext4 landing any time soon?

deitch commented 5 years ago

Thanks @the-maldridge . I will admit that it has been quite a labour of, well, masochism. :-)

void-linux is interesting. I was motivated to start diskfs from my work on linuxkit. There are only 2 places there where we need to exec out to an external program; building filesystems is one of them. These are all bits in a file/on a disk, so I said the 5 most dangerous words any engineer can say: "how hard could it be?"

Do you foresee ext4 landing any time soon?

I got pretty far with ext4. If I am not mistaken, reading works fully. The difficulty is writing, and particularly some of the areas that use tree structures. Rebalancing them (or even figuring out how to build them initially) got bogged down. That led to a switch to immutable filesystems first, since they are much easier to write. Hence iso (which is complete) and squashfs (which is mostly so). I do intend to return to ext4 once squashfs is done.

CanyonCasa commented 5 years ago

Would the gexto package be of any use in helping you along with ext4 support?

deitch commented 5 years ago

It really might, thank you @CanyonCasa . I can look at either integrating it, or just leveraging the few parts that got blocked. MIT and Apache licenses largely are compatible, so that is good.

deitch commented 5 years ago

It is at a state where both squashfs and ext4 are working read but not write. Time to get it completed.

probonopd commented 5 years ago

I'd be highly interested in squashfs read and write, and in zisofs read and write.

deitch commented 5 years ago

I as well @probonopd :-)

Just got to find hours to work on it.

Mikkelhost commented 3 years ago

Hey, I stumbled upon your repo! I am looking for a way to edit a Raspberry Pi image using Go. So I was wondering how the ext4 support is coming along? :) Or maybe you could point me in a direction that could make sense for editing these types of images using scripting.

deitch commented 3 years ago

Slowly. 😆

ext4 reading works, you can check out the ext4 branch. Writing was about halfway done last I saw it. I would love to give it a month and finish it up. But real life, as they say. Every now and then, someone needs it for business purposes and invests.

Mikkelhost commented 3 years ago

It is totally fine haha, no pressure, we all gotta do what we gotta do! Played with the library a little yesterday, and found that I could use the existing functionality to write to the boot partition of the Raspbian image :D

So just wanna say great job on this man!

vtolstov commented 3 years ago

gentle ping about ext4 support... also question how much money needs to be donated to get read/write support for ext3/ext4 ?

deitch commented 3 years ago

Hi @vtolstov . Can you email me directly at avi [at] aptimia [dot] com?

gaurav-gogia commented 2 years ago

Hey @deitch,

I found this library very recently. It looks pretty cool. I think it's going to be very helpful to me. My use case doesn't involve any writing, so I was going through different issues to see if it supported file systems other than FAT32.

Soo, did you have time to implement read-only support for NTFS or EXTn or ExFAT?

Please advise

Thanks!

deitch commented 2 years ago

Soo, did you have time to implement read-only support for NTFS or EXTn or ExFAT?

ext4 remains in process. I think the read-only works, and I might be able to publish it if I could find some time among business work. NTFS has not been done, nor has exFAT. It really is just a question of time. We have a pattern and process in place for getting it done, but it is just long hard work.

very helpful to me. My use case

What are you looking at?

gaurav-gogia commented 2 years ago

We have a pattern and process in place for getting it done, but it is just long hard work.

Understandable

What are you looking at?

I wish to read a disk image, parse it and read all the files/folders inside of it. I don't want to mount that disk image. Also, while I would prefer working on it in Go, a similar solution in some other programming language with maybe better library support is also fine by me.

deitch commented 2 years ago

some other programming language with maybe a better library support is also fine by me

I have had dreams of also having rust and other libraries for this for quite some time. It isn't all that hard. The hard part is reverse engineering enough of the filesystem and/or implementing the specification sufficiently to have a working r/w library. Once it is done in one, it is just work to do it in another.

We now have both qcow2 and ext4 more or less halfway done (read, not write). I wouldn't object to someone submitting another filesystem or disk format as a PR at all, but my focus - when I can find time - has to be on finishing those. As this is all OSS and not (yet) corporate-backed, it is catch-as-catch-can (unless/until a real user has sufficient value to pay to move them ahead).

gaurav-gogia commented 2 years ago

@deitch

I see, that makes sense. Well, thanks for putting in the initial effort and creating this library in the first place. :)

prologic commented 2 years ago

Hey @deitch also came across your library and reading through this issue I have a couple of questions:

a) Have you found the time/bandwidth to get back on this lately? b) What about we reduce the scope to just "creating" the file-system structures so that they can be mounted?

deitch commented 2 years ago

Have you found the time/bandwidth to get back on this lately?

Not really. Feel free to contact me offline to discuss.

What about we reduce the scope to just "creating" the file-system structures so that they can be mounted?

Do you mean read-only? That is mostly done. Read-write is the hard part, especially ext4. IIRC, the inode structures involve a tree that sometimes needs to be rebalanced, and lots of other elements change when you write anything.

I would love to get this moved ahead.

prologic commented 2 years ago

Just wanted to update this issue.

I've forked this repo here and merged @deitch 's ext4 branch. Unfortunately, due to my using the latest version of Go, the branch being a few years old, and my own stupidity, I had to remove a lot of code I could not get compiling and had no idea about 😅

My use-case is specifically for creating filesystem structures on a block device (such as a disk), à la the equivalent of mkfs.extN. cc @andig where we also talked about this in https://github.com/nerd2/gexto/issues/7

I'm a bit stuck though and need some help! I haven't figured out what the minimum set of fields is to populate in the Superblock and Group Descriptors here:

https://git.mills.io/prologic/go-diskfs/src/commit/a011c6ff1c6f7020d38fb5aa9753b4e73599eb6a/filesystem/ext4/ext4.go#L252-L258

If anyone can help 🙏
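As a point of reference for the superblock question above, here is a minimal sketch of where a handful of well-known ext2/3/4 superblock fields live on disk. It is not the code from either fork and does not answer which fields are the minimum for a mountable filesystem; it only shows the fixed byte layout (primary superblock at byte 1024, magic `0xEF53`, little-endian integers). The `summary` type and the helper names are hypothetical, and the synthetic image holds illustrative values only.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

const (
	superblockOffset = 1024   // primary superblock starts 1024 bytes into the volume
	superblockSize   = 1024   // on-disk size of the superblock
	extMagic         = 0xEF53 // s_magic, shared by ext2, ext3 and ext4

	// Byte offsets inside the superblock; all integers are little-endian.
	offInodesCount    = 0x00 // s_inodes_count (uint32)
	offBlocksCountLo  = 0x04 // s_blocks_count_lo (uint32)
	offLogBlockSize   = 0x18 // s_log_block_size (uint32); block size = 1024 << value
	offBlocksPerGroup = 0x20 // s_blocks_per_group (uint32)
	offMagic          = 0x38 // s_magic (uint16)
	offVolumeName     = 0x78 // s_volume_name (16 bytes, NUL-padded)
	volumeNameLen     = 16
)

// summary is a hypothetical struct holding only the fields decoded here.
type summary struct {
	Inodes, Blocks, BlockSize, BlocksPerGroup uint32
	Label                                     string
}

// parseSuperblock decodes a few fields from the primary superblock of img.
func parseSuperblock(img []byte) (summary, error) {
	if len(img) < superblockOffset+superblockSize {
		return summary{}, fmt.Errorf("image too small: %d bytes", len(img))
	}
	sb := img[superblockOffset : superblockOffset+superblockSize]
	le := binary.LittleEndian
	if m := le.Uint16(sb[offMagic:]); m != extMagic {
		return summary{}, fmt.Errorf("bad magic 0x%04x, want 0x%04x", m, extMagic)
	}
	label := sb[offVolumeName : offVolumeName+volumeNameLen]
	if i := bytes.IndexByte(label, 0); i >= 0 {
		label = label[:i] // trim NUL padding
	}
	return summary{
		Inodes:         le.Uint32(sb[offInodesCount:]),
		Blocks:         le.Uint32(sb[offBlocksCountLo:]),
		BlockSize:      uint32(1024) << le.Uint32(sb[offLogBlockSize:]),
		BlocksPerGroup: le.Uint32(sb[offBlocksPerGroup:]),
		Label:          string(label),
	}, nil
}

// syntheticImage builds an in-memory buffer with just those fields set.
// The values are illustrative; the result is not a mountable filesystem.
func syntheticImage(label string, logBlockSize uint32) []byte {
	img := make([]byte, superblockOffset+superblockSize)
	sb := img[superblockOffset:]
	le := binary.LittleEndian
	le.PutUint32(sb[offInodesCount:], 128)
	le.PutUint32(sb[offBlocksCountLo:], 1024)
	le.PutUint32(sb[offLogBlockSize:], logBlockSize)
	le.PutUint32(sb[offBlocksPerGroup:], 8192)
	le.PutUint16(sb[offMagic:], extMagic)
	copy(sb[offVolumeName:offVolumeName+volumeNameLen], label)
	return img
}

func main() {
	s, err := parseSuperblock(syntheticImage("rootfs", 0))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", s)
}
```

Reading these fields back from an image produced by a real mkfs is a cheap way to compare a hand-built superblock against a known-good one, field by field.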

deitch commented 2 years ago

The ext4 branch doesn't even actually compile right now. Lots of stuff in mid-edit. It probably could be locked down with several hours of work ("how hard could it be?" 😆) so that at least the read-only and mkfs parts would work. It was the writing that was most challenging. But I am not completely sure that is correct.

prologic commented 2 years ago

@deitch Yeah 😅 I managed to get something to build but yeah needs more work 😂

deitch commented 2 years ago

I just did an insane amount of lint cleanup on everything but ext4 branch, and then rebased ext4 to it. Also cleaned up some testing errors (i.e. errors in the tests themselves).

I really would like to get ext4 working. Do you mind sharing your changes to get the build working?

prologic commented 2 years ago

I just did an insane amount of lint cleanup on everything but ext4 branch, and then rebased ext4 to it. Also cleaned up some testing errors (i.e. errors in the tests themselves).

I really would like to get ext4 working. Do you mind sharing your changes to get the build working?

Yes! But I nuked a whole bunch of stuff I didn't understand and couldn't fix 😅

See here: https://git.mills.io/prologic/go-diskfs

I'm not proud of it, and it obviously doesn't work (yet) 😅

prologic commented 2 years ago

Been also learning the Ext2 filesystem and trying to port pyext2 to Go (which doesn't look like a lot of code to wrap my puny head around 😂)

gaurav-gogia commented 2 years ago

@deitch @prologic

Maybe this library in C/C++ can help in updating support for file systems like EXT2/3/4?

Sleuthkit's file system library seems to support many file systems and many types of disk images.

Disk images supported: raw, vmdk, vdi, vhd, e01, aff

File systems supported: ntfs, fat32, ext4, ext3, ext2, etc.

https://github.com/sleuthkit/sleuthkit/tree/develop/tsk/fs

lapubell commented 1 year ago

i too would be interested in ext2 support. I have a client that is looking at this for a dev project, and creating an ext2 filesystem for a USB device is step #1.

I really have no idea how big of a lift that is, as I mostly work in the web dev scene. But I have to imagine that no need for journaling would make it easier? Anywhere I can assist I'd be happy to!

prologic commented 1 year ago

@lapubell I would have a look at our implementation that is used by the GoNix project.

deitch commented 1 year ago

Welcome @lapubell

ext2 vs ext4 is not all that different, if you ignore journaling. If I recall correctly, the last time I was deep into it, I got bogged down on the inodes and rebalancing them.

@prologic did you get your fork working? Is there something you can upstream?

lapubell commented 1 year ago

Yeah, this level (low) of programming is something that is pretty new to me. I'm way more used to pushing bytes over TCP/IP than pushing them around on disks. I don't even know how to balance inodes, let alone rebalance them.

Anyone got a good intro to this topic for me to read? I got some time coming up between the 25th and the 1st and this is all rather interesting.

Also, @prologic your implementation link is 404ing to me. Maybe it's a private repo?

deitch commented 1 year ago

Yeah that was a bad link. Try this https://git.mills.io/prologic/go-diskfs

deitch commented 1 year ago

Check my ext4 branch. I should have links there.

prologic commented 1 year ago

Yeah I haven't cleaned up our fork yet (sorry for the bad link), but yeah it all works nicely for creating ext2 filesystems.

deitch commented 1 year ago

If it works nicely, let's get it in? Partial working is better than nothing. If I recall correctly, it is based partially off of the existing ext4 branch, so it should only be an improvement.

deitch commented 1 year ago

creating

is it able to write and update too?

prologic commented 1 year ago

@deitch Actually our implementation of ext2 filesystem creation support is a direct fork of a Python implementation. There wasn't much inspiration from the ext4 branch at all 😢

deitch commented 1 year ago

Ah, oh well. If you feel like updating the ext4 branch with your work, it would be great.

prologic commented 1 year ago

I probably will (with time and all that)

mheese commented 1 year ago

@deitch is there any movement on this?

From how I understand this thread, what is working already is creating the filesystem and reading from it. It would be nice to already just have this functionality merged maybe?

For myself I'm actually only interested in the "create filesystem" and "open filesystem" (for reading the label) parts. So I will fork it for now and probably put something together from the ext4 branch and/or @prologic's work to suit my needs.

deitch commented 1 year ago

From my memory, yes, that is correct. The branch itself is non-functional because it has a lot of structures that are in the middle of the editing process, but those easily can be commented out or moved to a new branch.

I would much appreciate a PR on it.

mheese commented 1 year ago

@deitch yeah, there was a lot of cleanup to get this to compile. I cleaned it up, you can take a look at the branch here: https://github.com/githedgehog/go-diskfs/tree/hh-ext4

Trying to use it though, the Create() doesn't work because it runs out of disk space

panic: Error writing Superblock for block 335544320 to disk: write /dev/loop6: no space left on device

I think the calculations are probably off, and there might need to be some more details filled out in the superblock. I wouldn't call myself an ext4 expert :)

And trying to use the Read() function of a filesystem which was created with mkfs.ext4 fails in the checksum calculations ... both for the superblock as well as the group descriptors. Commenting these out though I'm able to get the filesystem and at least the label is being read correctly.

So the TL;DR, there seems to be a lot of work left. Not sure I'll deal with this tbh, as I'm a little bit under time pressure right now. And calling out to mkfs.ext4 and some other tool to get the label is not the end of the world for me. It would be definitely really nice though to have this.
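The cause of the "no space left on device" panic is not established in this thread. One cheap sanity check for that class of bug, sketched below, is to compute where every superblock copy should land and confirm each offset fits on the device. The sketch assumes the `sparse_super` feature (copies in groups 0 and 1 and in groups that are powers of 3, 5 and 7) and common mkfs defaults (8 × block size blocks per group; first data block 1 for 1 KiB blocks, 0 otherwise). The `layout` type and function names are hypothetical.

```go
package main

import "fmt"

// hasBackupSuperblock reports whether block group g holds a superblock copy
// under sparse_super: groups 0 and 1, plus powers of 3, 5 and 7.
func hasBackupSuperblock(g uint64) bool {
	if g <= 1 {
		return true
	}
	for _, base := range []uint64{3, 5, 7} {
		p := base
		for p < g {
			p *= base
		}
		if p == g {
			return true
		}
	}
	return false
}

// layout is a hypothetical helper describing the basic geometry, in blocks.
type layout struct {
	BlockSize, BlocksPerGroup, FirstDataBlock, TotalBlocks uint64
}

// defaultLayout applies the assumed mkfs defaults to a device size in bytes.
func defaultLayout(deviceBytes, blockSize uint64) layout {
	l := layout{
		BlockSize:      blockSize,
		BlocksPerGroup: 8 * blockSize, // one bitmap block covers 8*blockSize blocks
		TotalBlocks:    deviceBytes / blockSize,
	}
	if blockSize == 1024 {
		l.FirstDataBlock = 1 // block 0 is left for the boot area
	}
	return l
}

// groupCount is the number of block groups, rounding the last one up.
func (l layout) groupCount() uint64 {
	usable := l.TotalBlocks - l.FirstDataBlock
	return (usable + l.BlocksPerGroup - 1) / l.BlocksPerGroup
}

// superblockOffsets returns the byte offset of every superblock copy.
func (l layout) superblockOffsets() []uint64 {
	var offs []uint64
	for g := uint64(0); g < l.groupCount(); g++ {
		if !hasBackupSuperblock(g) {
			continue
		}
		if g == 0 {
			offs = append(offs, 1024) // the primary copy is always at byte 1024
			continue
		}
		block := l.FirstDataBlock + g*l.BlocksPerGroup
		offs = append(offs, block*l.BlockSize) // note: blocks converted to bytes
	}
	return offs
}

func main() {
	const deviceBytes = 1 << 30 // 1 GiB
	l := defaultLayout(deviceBytes, 4096)
	fmt.Println("block groups:", l.groupCount())
	for _, off := range l.superblockOffsets() {
		fmt.Printf("superblock copy at byte %d, fits: %v\n", off, off+1024 <= deviceBytes)
	}
}
```

A write path could run the same check before touching the device and fail with a descriptive error instead of a partial write.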

prologic commented 1 year ago

If you're okay with using Ext2 we built a version of this that currently works for creating the file system 👌

mheese commented 1 year ago

@prologic @deitch ok, maybe I'll take a crack at this again and compare the create methods... I'm sure the problems left aren't that hard anymore to solve

deitch commented 5 months ago

Have a look at #218

Help is very much appreciated.

deitch commented 4 months ago

ext4 reading is complete and merged in #218. I'm sure people will find and report issues, but this is a huge step.

There is a branch ext4-write based on that, but it's mostly a few changes to three files, using structs that don't exist anymore. Maybe it will help, maybe it won't.

aol-nnov commented 2 weeks ago

Hey, @deitch !

Glad you implemented ext4, really impressive! Library grows since I've stumbled upon it, nice to see!

Maybe you could share the ext4-write branch or something like a top-level design for the ext4 write feature? I'd like to try and implement it!

My needs are to create disk images with ext4 partitions. Files for that partition are in a tarball, with different file types, symlinks mostly, but other special files are present, too. So there is a need to create files of types other than dirFileTypeRegular or dirFileTypeDirectory, but in the current interface I cannot see any means for that...

deitch commented 2 weeks ago

Hi @aol-nnov. Thanks. Work on it when we can. Someday, it might even be complete. Probably exactly when go no longer is used anywhere. 😆

ext4-write branch doesn't exist anymore. Whatever was there is merged in. As you noticed, you already can create directories and regular files. I happily would take a PR for creating symlinks as well as device files. I would follow something similar to os.Symlink(), unix.Mknod(). It already knows how to read them.

iso9660 and squashfs already support them, but those are read-only filesystems, created based on a directory on disk, so no calls to create them.

And, of course, Create() already exists for ext4 as well. So not much else is needed.

aol-nnov commented 2 weeks ago

Thank you for the prompt response, @deitch

I would follow something similar to os.Symlink(), unix.Mknod().

Yeah, right. The main question is which interface should I place them to...

deitch commented 2 weeks ago

Part of ext4.Filesystem, I would think, parallel to OpenFile() and Mkdir() and Stat(), etc.

Is there a better place?

aol-nnov commented 2 weeks ago

Oh, okay.. I just tried to find out if you plan to bring those methods to filesystem.FileSystem interface.

So, you propose not to expose them in the common interface, right?

deitch commented 2 weeks ago

Ah, that was what you meant.

I was not sure about that. On the one hand, these only exist in the context of some filesystem types. On the other hand, the intended approach is to use both disk.GetFilesystem() and disk.CreateFilesystem() to get a generic filesystem.FileSystem and use that to read and write everything. There aren't all that many functions that might not be implemented in others, so why force people to cast types?

I think you are right. We should do this in 2 steps:

  1. A PR that extends the filesystem.FileSystem interface to include Symlink and Mknod, and extends all existing types to make it an error.
  2. A PR that enables those in ext4