laktak / chkbit

Check your files for data corruption
MIT License

Feature request: Possibility to store everything in one file instead of one file per folder #22

Open AxelPetermann opened 1 month ago

AxelPetermann commented 1 month ago

I would like to have the option to store everything in just one ".chkbit" file instead of one file per folder. Why? Because on Windows I use software that monitors some folders, and I can't tell this software to ignore the ".chkbit" file.

alexkallai commented 1 month ago

I'd also welcome such a change; it would make this tool really versatile. In its current state I wouldn't use the application, since I don't want to "litter" my whole data structure with all the index files. (It also makes backing up the data even slower, since copying many small files is rather slow.)

Though in my opinion the approach should be different: instead of JSON, the single file should be an sqlite3 database, which would make transactions faster on a bigger dataset.

laktak commented 3 weeks ago

I've updated the FAQ with more details for this question:

That's why I did not consider a central index.

huyz commented 4 days ago

https://github.com/ambv/bitrot takes the centralized approach but it doesn't seem very actively maintained

laktak commented 3 days ago

I had an idea for a simple solution that does not require a lot of changes to the code.

It's still missing some cleanup to remove deleted directories from the index, but you can test the binaries from the prerelease-artifacts in https://github.com/laktak/chkbit/actions/runs/12039051476

  --index-db                use a index database instead of index files

This places a single .chkbitdb in the current directory.

AxelPetermann commented 3 days ago

Thanks for still having a look into this.

I did a quick test and got the following error:

EXC .chkbitdb: Binary was compiled with 'CGO_ENABLED=0', go-sqlite3 requires cgo to work. This is a stub

I've tested with the following command:

chkbit.exe "D:\test for chkbit" --update --index-db

laktak commented 3 days ago

I removed the flag from the build process. It should work now, though I can't test on Windows. Please use https://github.com/laktak/chkbit/actions/runs/12052712326

AxelPetermann commented 2 days ago

Thanks again, but the exact same error occurs.

$ chkbit.exe --version
github.com/laktak/chkbit
a3af97f8a4d7c1faa927eb5bee21b3586e1d2010

laktak commented 2 days ago

Thanks for your help. I was unfamiliar with cgo, but it has to do with cross-compiling: Go automatically enables it when building on the source OS, which is why I never saw this on Linux. I will look for an alternative to sqlite because it won't allow me to cross-compile.

gstjee commented 1 day ago

  • if it is damaged, only one directory is affected

To keep the benefit above and still get rid of the multiple index files in subdirectories, a central index could consist of 3 index files in the root dir as below, plus some other suggestions:

  • index1: the current index file
  • index2: a backup of the current index file
  • index3: the previous (older) index file, generated whenever the index is updated, as a safety net in case the index was updated by accident

Some suggested code flow, to ensure the indexes are not corrupted or altered during validation:

  a. The program can check internally whether index1 and index2 are equal. If they are not, it can issue a warning and stop further file validation.
  b. The program can have an option, e.g. --index-file, so that the user can force validation with a specific index file. In that case check a is not needed.

laktak commented 21 hours ago

If anyone wants to give it a try, I have a new version that can be tested:

https://github.com/laktak/chkbit/actions/runs/12089414524

laktak commented 21 hours ago

@gstjee

In regards of central index file we can have 3 index files in root dir like below and some other suggestions

The indexdb will be placed in the directory that is to be checked. There will be backups available and a json export for long term storage.

* And is the current .chkbit in JSON format? If possible, can we have a linear vertical flow of content in that file instead of horizontal? In a vertical layout it is easier to navigate and compare two index files.

Not really on topic and also not sure what you mean with vertical flow but if you want to view it differently you can use jq to extract data.

gstjee commented 20 hours ago

If anyone wants to give it a try, I have a new version that can be tested:

https://github.com/laktak/chkbit/actions/runs/12089414524

Checked on Windows 10 for update, validate, and append. Seems to be working fine.

1. What if the process gets killed by the system or the user while the index is being updated? Will it corrupt the index database file? Same question for the older index files implementation. Is there any safety mechanism?

2. With the index database, can we check individual subdirectories as well, like we can in the original code?

.... not sure what you mean with vertical flow...

In Notepad++, all of the .chkbit file's content is shown on one single horizontal line, so it is difficult to edit, check, compare, etc.

laktak commented 19 hours ago

Thanks for the quick feedback. Good to know that bbolt works on Windows.

1 - The plan is to write to a new database on each run and, once finished, move the old one to a backup and move the new one in.
2 - Yes, just not yet.

For notepad - you are asking for a formatted json. You can get that by running jq < .chkbit, see https://github.com/jqlang/jq
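For readers without jq, the same re-indentation can be sketched in Go with the standard library. The sample input below is made up; the real layout of a .chkbit file is not shown in this thread, and prettyJSON is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
)

// prettyJSON re-indents compact JSON so that each value sits on its own line.
func prettyJSON(compact []byte) ([]byte, error) {
	var out bytes.Buffer
	if err := json.Indent(&out, compact, "", "  "); err != nil {
		return nil, err
	}
	return out.Bytes(), nil
}

func main() {
	// Made-up single-line input standing in for an index file.
	sample := []byte(`{"files":[{"name":"a.txt"},{"name":"b.txt"}]}`)

	pretty, err := prettyJSON(sample)
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid JSON:", err)
		os.Exit(1)
	}
	fmt.Println(string(pretty))
}
```

To format a real file, read it with os.ReadFile and pass the bytes to prettyJSON. Writing the result to a separate file rather than back into the index avoids changing the file the tool reads.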