natevw / fatfs

Standalone FAT16/FAT32 filesystem implementation in JavaScript
47 stars 13 forks source link

Writing to empty files causes fatfs to write to negative sectors. #36

Open asivery opened 1 year ago

asivery commented 1 year ago

I've encountered a problem, where if you write to a file, whose first cluster is set to "0" (non allocated), makes this library treat cluster "0" as a valid cluster, then writes to it. That later causes it to write to negative sectors on the disc. Instead, when writing to files which haven't yet been allocated a cluster, a new cluster should be allocated, then the FAT should be updated.

natevw commented 1 year ago

Thanks for the report! I'll try take a look at this sometime — if you have even just some failing tests that could be incorporated via a draft PR that might be helpful as it's been a while since I worked on my own code here :-) But your description is clear and it makes sense where the problem likely is.

asivery commented 1 year ago

@natevw Thanks for the response - sorry for the delay, I didn't notice you responding. I'll try to make some tests that would prove this, for now I've written a workaround. In my fork of this repo I've added a function that can reallocate a file - give it a new cluster to work with. For now I'm just calling this function before I call open, if stat reports that the file's size is 0. It's a far-from-great solution, but it works for now...

natevw commented 1 year ago

No worries! Glad you've got something going meanwhile. Since I was thinking about it again I figured I'd go look to see if the bug was obvious, and… well… nothing in this code is obvious to me anymore 🤣

I suspect it must start somewhere within https://github.com/natevw/fatfs/blob/be952ae0aef2d6a1ccd9bbe7e34a3574f952bdee/chains.js#L66-L81 where basically instead of respecting 0 as being an empty chain of sectors it somehow goes off and smashes the actual 0 sector itself. There could be a similar problem on the read side, unless it gets shortcut or funnels through the same abstraction as the write path.

It's the abstractions I wish I had documented; I don't even remember if these "chain" things are vocab from the FAT spec itself or just something past-me made up to encapsulate some patter I saw at the time.

asivery commented 1 year ago

Yeah, I'm not really sure how the code works as well, the abstractions make it somewhat hard to work with, especially for people who are seeing it for the first time. I've spent some time trying to figure out where the problem lies, and somewhere here would be a good place to implement the solution: https://github.com/natevw/fatfs/blob/be952ae0aef2d6a1ccd9bbe7e34a3574f952bdee/chains.js#L223-L237 firstCluster is available to all methods within a 'chain'. If it's equal to zero, it means a new one should be allocated, and the FAT should be updated. Unfortunately I wasn't able to implement that myself. For reading this is not a problem, as stat returns the file is 0 bytes long in that case, so it won't read anything either way - it doesn't matter that the position the "data" is located at is illegal.

Also, there seems to be some weird caching going on. Even after executing the function I wrote, I have to recreate the fatfs instance, as it caches that old firstCluster value. I've also added a rename function (it only supports renaming files without changing the directories, which made the implementation pretty simple), and I'm worried that the caching will cause it to break if you rename a file to the old name of a previously renamed file. If you could help me disable the caching, or make a function that would flush the caches that would be awesome as well.

EDIT: Changed "TOC" to "FAT" - working with Minidiscs too much causes you to think of everything as a minidisc :sweat_smile:

natevw commented 1 year ago

firstCluster is available to all methods within a 'chain'. If it's equal to zero, it means a new one should be allocated, and the FAT should be updated

Excellent tip — that's where I was struggling myself to figure out even where the 0 that needs to be handled better was getting tracked.

I'm hesitant to promise any particular timeline of when I'd be able to really dive back into this, but I'm hopeful that the logic for brand new files and/or files that don't have enough blocks clusters allocated to them can get pulled into service for this currently-unhandled case too somehow.

natevw commented 1 year ago

So I'm wondering what happens if this code:

https://github.com/natevw/fatfs/blob/ea53bdc4e5482ade19d346c21b38ae27de59beee/chains.js#L107-L111

± just initializes itself with cache = (firstCluster) ? [firstCluster] : [] and whatever makes sense for the .firstCluster assignment (which maybe needs to be reviewed more thoroughly for how that's used).

It's probably not quite as simple as that, but basically updating the code to handle an empty list of clusters seems like the first step and then working outwards to find any parts of the code that assume the list isn't empty and fixing those.