martymac / fpart

Sort files and pack them into partitions
https://www.fpart.org/
BSD 2-Clause "Simplified" License

Meaning of 0th partition file in different modes is confusing #36

Closed alexhunsley closed 2 years ago

alexhunsley commented 2 years ago

This isn't a bug, but a confusing-seeming design.

In '-s' mode, a '0' numbered partition file seems to be always created, to hold files too big for the specified -s size, even if no files were too big. So if the 0 numbered partition file is empty, fpart managed to partition everything ok, and you have to ignore the partition 0 file.

Compare to '-n' mode (i.e. that number of partitions with data split as evenly as possible between them): partition file 0 is just a regular partition file containing file names. It being non-empty is NOT an error condition.

Am I understanding this correctly? Is there any way to make the 'overflow' partition file for '-s' mode not be the 0th partition file? e.g. a flag could be introduced that means the overflow files partition file is called something unique.

martymac commented 2 years ago

Hello Alex,

Thanks for your feedback.

Yes, you are understanding correctly and that behaviour is by design. It is explained in fpart(1), see description for option '-s'.

The problem is: when you crawl the files and want partitions with a maximum size set, you never know whether every single file you encounter will fit. As fpart's job is to ensure no file is left out, it has to put it somewhere. Special partition '0' has been chosen because it provides a fixed, known partition number for such cases, and that partition is the only one whose size can exceed the size you have chosen with option -s. No option is provided to change that 'special partition' number.

There is no such behaviour with option -n because it is not needed as you do not limit the size of produced partitions. As a consequence, you are right, when using option '-n' partition 0 has no special meaning.

If the presence of that partition is a problem when it is empty, it could be removed in a second pass. That idea is already in the TODO list, see: https://github.com/martymac/fpart/blob/master/TODO#L23. I may work on it but, to be honest, this is not a high priority feature right now.

I'll close that issue for now. Feel free to re-open it if needed :)

Best regards,

Ganael.
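
A minimal sketch of how a consumer script might treat partition 0 under the behaviour described above. It assumes fpart was run with -s in non-live mode and with -o, so that list files are named '<prefix>.<N>' (as in the 'files.0' naming used in this thread). The function name classify_partitions is hypothetical and not part of fpart.

```python
import glob
from pathlib import Path


def classify_partitions(prefix):
    """Group list files named '<prefix>.<N>' by role.

    Assumption: partition 0 holds only files too big for -s (non-live
    mode), and an empty list file carries no work and can be ignored.
    """
    base = Path(prefix)
    numbered = []
    for path in base.parent.glob(glob.escape(base.name) + ".*"):
        suffix = path.name[len(base.name) + 1:]
        if suffix.isdecimal():  # ignore anything that is not '<prefix>.<N>'
            numbered.append((int(suffix), path))

    result = {"oversized": [], "regular": []}
    for number, path in sorted(numbered):  # numeric order, so .2 precedes .10
        if path.stat().st_size == 0:
            continue  # empty list file, e.g. an unused partition 0
        result["oversized" if number == 0 else "regular"].append(path)
    return result
```

With -n, partition 0 is a regular partition in the version discussed here, so this classification would be wrong for that mode; that is exactly the ambiguity raised in this issue.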

alexhunsley commented 2 years ago

Hi Ganael, thanks for the explanation.

Can I suggest a scheme that would make the output more consistent? How about if partition 0 is only used for files that are too big? And then partitions 1 and up contain the files that were ok. This way, the output of fpart is always interpretable and understandable without needing to know what flags fpart was run with, and there is no ambiguity.

You see, I'm writing a script that uses fpart and it feels strange that the part of my script that uses the output from fpart has to worry about whether I passed -n or -s etc into the fpart command. Ideally, that detail would be irrelevant, the output data would stand on its own.

It's also foreseeable that I'd store the output of fpart to come back to later. With the current use of partition 0, its contents can mean one of two things, and I don't think it's possible to tell which if you don't have the original invocation handy. It seems to be a bit of arbitrary complexity where it's not needed, if that makes sense.

alexhunsley commented 2 years ago

If you don't feel that's a change worth making, I'll probably fork the repo and have a shot myself! I appreciate that if you just made this change to the default behaviour it would break backwards compatibility.

martymac commented 2 years ago

Hello Alex,

I am not sure I understand exactly what you mean. Currently, I think that fpart's output already stands on its own because you can easily skip partition 0 if it is empty (and if that is the case, you know for sure that option -s has been used). If it is not empty, you have to take into account every partition output in your consumer program if you want to reach all files.

Adding a partition 0 for option -n would create a bucket that would never be used in that mode and would seem odd to users. Moreover, you wouldn't be able to guess afterwards whether option -s or -n has been used either. It seems to me that it would complicate code and only shift the problem.

Also, as you mentioned, it would break backwards compatibility with existing tools.

Maybe the easiest way to clean up empty partition 0 would be to add a second pass and remove it if empty ? That way, you would get a really consistent output, but I am not sure if that's what you want exactly...

hjmangalam commented 2 years ago

Hi Ganael, I think that Alex is commenting on fpart's behavior with large files in that partition zero (P0) contains both files < the partition size as well as files that are too large. If P0 was either empty or contained ONLY files that were larger than the partition size, it would be easier to handle. ie, if P0 was size zero, ignore and go to P1; if P0 is nonzero, all the files named in it are > partition size and may need special processing.

Coincidentally, I was just starting to consider the best way to handle huge files in parsyncfp when I saw this exchange. If fpart partitioned them separately (as in storing ONLY large files in P0), it would make things easier for consumer programs. Otherwise, you have to stat all the files in P0 to check their size. Is there a way around this?

Or am I getting this wrong?

In my hands, the fpart output of a scan thru a large dir (with -L) that contains several too-large files results in a P0 that contains several smaller-than-partition-size files as well as the larger-than-partition-size files. This is with fpart v1.2.1.

$ fpart -v -L -s 10737418240 -o /home/hjm/splits/f /home/hjm/Downloads
Examining filesystem...
Filled part #0: size = 12191462505, 996 file(s)
Filled part #1: size = 10818845370, 5744 file(s)
Filled part #2: size = 4497794387, 10775 file(s)
17515 file(s) found.

Without the '-L' flag, P0 is empty, even tho there are files that are much larger than 10737418240 bytes (10 GiB) (Linux ISOs):

$ fpart -v -s 10737418240 -o /home/hjm/splits/f /home/hjm/Downloads
Examining filesystem...
17515 file(s) found.
Sorting entries...
Part #0: size = 0, 0 file(s)
Part #1: size = 10737418240, 3379 file(s)
Part #2: size = 10737418240, 1135 file(s)
Part #3: size = 6033265782, 13001 file(s)
Writing output lists...
Cleaning up...

Is the above behavior expected? Or have I done something wrong? Best

Harry
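
A rough sketch of the "stat all the files in P0" workaround mentioned above, for a consumer that cannot rely on P0 containing only oversized files. It assumes the list file holds one path per line; the helper names read_file_list and split_by_size are made up for illustration and are not part of fpart or parsyncfp.

```python
import os


def read_file_list(list_path):
    """Read a partition list file, assuming one path per line."""
    with open(list_path, encoding="utf-8", errors="surrogateescape") as fh:
        return [line.rstrip("\n") for line in fh if line.strip()]


def split_by_size(paths, max_size):
    """Split paths into (regular, oversized) lists of (path, size) tuples.

    A file counts as oversized when its size in bytes exceeds max_size,
    i.e. the value that was passed to fpart's -s option.
    """
    regular, oversized = [], []
    for path in paths:
        size = os.lstat(path).st_size  # one extra stat call per listed file
        (oversized if size > max_size else regular).append((path, size))
    return regular, oversized
```

The cost is one additional stat per entry, which is the overhead the question above is trying to avoid.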

martymac commented 2 years ago

Hello Harry,

OK, I think I better understand the problem.

There are two cases.

In non-live mode, fpart with -s option uses partition 0 only for files bigger than the given partition size (just 'because you have to put them somewhere'). No partition can exceed the max size given, except partition 0.

In live mode, fpart's behaviour is different as it produces partitions on the fly and does not cache them (so we really talk about a single partition : the current one). The given -s flag is used to check whether the given max partition size has been reached ; it cannot be as strict as in non-live mode as, again, you have to put the current file somewhere (and it has to be the current partition), so the max size is more informational and may (will, in fact) be exceeded.

(as a side note: I should probably add details about that in the man page)

That behaviour in live mode would be hard to change as we would have to generate partition 0 and cache it to finally produce it at the end of the run (no other choice as you don't know what files you will encounter during FS crawling ; the last one could be a huge one). As live mode has been designed to allow starting syncing the file tree while fpart is running you would have to sync all those big files in a single run at the end of fpart pass. That's probably not a good idea :/

On the other hand, non-live mode output could be fixed by always numbering output partitions from 1, except when option -s is used and partition 0 contains files. In that case, a partition 0 could appear (containing only big files), with option -s only.

Alex, is that what you meant ? Harry, what do you think ?
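
The difference between the two cases can be illustrated with a toy model. This is not fpart's actual algorithm: the non-live packing below is a plain first-fit-decreasing stand-in chosen for illustration, and the function names are hypothetical. It only shows why a streaming partitioner has to exceed the limit while a cached one can set big files aside.

```python
def live_partitions(sizes, max_size):
    """Streaming model: each file goes into the current partition, which is
    closed as soon as its total reaches max_size. Nothing is cached, so a
    big file necessarily pushes its partition over the limit."""
    parts, current, total = [], [], 0
    for size in sizes:
        current.append(size)
        total += size
        if total >= max_size:
            parts.append(current)
            current, total = [], 0
    if current:
        parts.append(current)
    return parts


def nonlive_partitions(sizes, max_size):
    """Cached model: all sizes are known up front, so files larger than
    max_size are set aside (the 'partition 0' role) and the rest are packed
    without ever exceeding max_size."""
    oversized = [s for s in sizes if s > max_size]
    regular = []
    for size in sorted((s for s in sizes if s <= max_size), reverse=True):
        for part in regular:
            if sum(part) + size <= max_size:
                part.append(size)
                break
        else:
            regular.append([size])
    return oversized, regular


print(live_partitions([4, 30, 3, 5], 10))     # [[4, 30], [3, 5]]
print(nonlive_partitions([4, 30, 3, 5], 10))  # ([30], [[5, 4], [3]])
```

In the streaming model the 30-unit file lands in the first partition next to a small file, which mirrors the mixed P0 reported above with -L.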

alexhunsley commented 2 years ago

Hi Ganael,

Yes, Harry has given a good example of what I’m talking about: the ambiguity of any files listed in a partition 0.

I can’t speak to how live mode would be impacted by any change, as I’ve never used it yet, but I’m a definite supporter of the idea of “regular” partitions starting at index 1 and reserving partition 0 for only files that were too big.

If it reduces ambiguity, you could also rename partition 0 in a second pass, if non-empty, to be e.g. files.overflow instead of files.0. Just to make it very clear it is not just a regular partition like the others.

hjmangalam commented 2 years ago

Hi all, I tend to agree with Alex - even with live mode (what I use the most), it would be very useful to reserve P0 for too-large files and populate the chunk files starting at 1 with less than chunk-size files. That way we could start the processing of the 'normal' files at 1 and then make the choice as to how to process the bigfiles either after processing the regular files or processing them in a thread or fork as they are logged to P0. In my case, when fpart finishes and no more chunk files are being written, then I can go to P0 and do whatever I need to do to finish things off with the bigfiles. Thanks for your consideration and ... Merry Xmas! Stay safe! Harry

martymac commented 2 years ago

Hello,

Thanks for your feedback.

Harry, I get the idea but it seems a bit odd to me to put big files in a dedicated partition in live mode. The original idea of that mode was to go fast, not cache anything and act as quickly as possible on generated file lists. That would break that paradigm because partition 0 would have to be cached, and would only be complete at the end of the run. As a consequence it means that fpart handlers for partition 0 would only be triggered at the end of the run too. If you end up with a really huge partition 0 (think about someone using '-s 1k'... nearly all files would end up in there) you would have to start acting on it (start a sync for example) only after FS crawling, which is the worst possible scenario.

For the consumer part, that would also complicate the code as you would have to handle that special partition manually and probably re-code a splitting scheme while fpart can already do a good part of the job.

That's why I think that current handling, if not perfect, is a good balance between simplicity (KISS) and efficiency.

Anyway, if you think it's really necessary to act on big files separately, another -simpler- approach could be to add an option to just exclude files bigger than max partition size and log them to stdout (even with option -o enabled, to avoid caching an additional partition), leaving the consumer program to do whatever it wants with that. That would probably be a compromise and would better fit fpart's original design. But I am still not convinced this is something we want for live mode.

Anyway, I got the idea : for both modes, I will start numbering regular partitions from 1. Partition 0 may appear only in non-live mode, when option -s is used and it contains files. I'll put that on the TODO list and work on it ASAP.

Merry Christmas to both of you,

Ganael.

hjmangalam commented 2 years ago

Hi Ganael, Yes, I think your solution to keep the current behavior but with the too-large files sent to STDOUT is good - I can certainly live with that approach. If you could also send the bytesize along with the fully qualified file name separated by a tab, that would be perfect - that way I can decide whether it's worthwhile doing the extra processing or just pass it thru as is. Thanks again for your consideration and feedback.

Harry

alexhunsley commented 2 years ago

Ganael, That sounds good! Always starting the regular data at partition 1 definitely will make the output easier to consume.

Merry Christmas both!

martymac commented 2 years ago

Hello,

Harry, I've added suggested changes for live mode to the TODO list. I'll work on that ASAP.

Thanks again to both of you for your feedback !

(I'll leave that issue open for now)

hjmangalam commented 2 years ago

Thanks very much, Ganael. A really nice Xmas gift. Harry

martymac commented 2 years ago

Hello,

I've pushed a first update that makes fpart start numbering partitions at '1' instead of '0', as requested.

Could you try it and tell me if it fits your needs ? Future updates will come to skip empty partition '0' as well as to allow excluding too big files when option -s is used, but I still have to work on that.

Cheers,

Ganael.

hjmangalam commented 2 years ago

Thanks for the quick turnaround, Ganael. It's an early release, so the full docset is not expected, but for general release, you may want to include instructions on how to compile the released code to an executable. Most ppl who will use fpart can figure this out for themselves, but for those who don't...

For me, this works on Linux: 'git clone https://github.com/martymac/fpart.git; cd fpart; aclocal; automake --add-missing; autoconf; ./configure; make -j4'

As always, builds without ANY errors or even warnings. I'll try integrating the build later today. Best wishes harry

hjmangalam commented 2 years ago

Hi Ganael, Just starting to play around with it. I see that it generates files starting with '.1' just fine as expected, but I'm not seeing any STDERR output that lists files too large for the partition. ie in a dir that had many files larger than the partition size (50m), I only see regular partitions.

However, this may be a misunderstanding on my part: from the latest man page, I see:

LIVE MODE
-L Live mode (default: disabled). When using this mode, partitions will be generated while crawling filesystem. This option saves time and memory but will never produce special partition 0 (see option -s). As a consequence, it can generate partitions larger than the size specified with option -s. This option can be used in conjunction with options -f and -s, but not with option -n.

I thought that when using '-L' with a '-s' size constraint, fpart would skip the too-large file, writing the size and path to STDERR so the user could use that info to process it as needed. This may need an additional flag to cause that behavior tho. The default acts as in the current man page; a '-j' (an option letter not yet used) will cause the larger files to be skipped as I describe.

In my case, I want to evaluate large files and (if too large) split them into partition sized chunks so they can be processed in parallel along with the rest of the files. This is so you don't have a few huge files keeping the process busy when the rest of the chunks have long-finished. If you wanted to include that option within fpart, I would not object ;). Best Harry
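
As a sketch of the splitting step described above, a too-large file can be cut into byte ranges no bigger than the partition size, so that each range can be handed to a separate worker. The function name chunk_ranges is hypothetical; nothing here is provided by fpart.

```python
def chunk_ranges(file_size, chunk_size):
    """Return (offset, length) pairs that cover file_size bytes in pieces
    of at most chunk_size bytes each; only the last piece may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


print(chunk_ranges(25, 10))  # [(0, 10), (10, 10), (20, 5)]
```

How the ranges are then transferred and reassembled is left to the consumer program.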

martymac commented 2 years ago

Hello Harry,

Thanks for your feedback.

Regarding building fpart from source, it is explained here :

https://www.fpart.org/#installing-from-source

As stated in my previous message, I still have to work on two changes:

  • skip partition 0 if it is empty
  • provide an option to exclude (and print) too big files when in live mode and option -s is used

so please be patient, I'll work on that ASAP :)

Best regards,

Ganael.

hjmangalam commented 2 years ago

Hi Ganael, Sorry to have jumped the gun. ;) I appreciate your efforts. H

martymac commented 2 years ago

Hello,

I've pushed the missing bits. It adds option -S (Skip big files). Skipped files will appear immediately (stdout) as belonging to a special partition 'S' (as in 'S'kipped). I hope that will match what you were looking for.

I'll close that issue for now; any feedback welcome :)

(and thanks again for your suggestions, helping fpart get better!)

hjmangalam commented 2 years ago

Hi Ganael, Thanks for your time.. I think I see what you did - now everything is sent to STDOUT & left to the user, which I could deal with, but I was hoping that the current '-S' flag would continue to allow placement of files that make up chunks smaller than the defined partition into the files defined by the '-o' flag, and ONLY the files > partition size would be output to STDOUT to be captured and used by the user. Or better, sent to a special partition file (as you allude to above).

So an option set like this could be used:

fpart -L -S -s 200m -o ~/chunks/f ~/Downloads

which would result in the usual set of partition files in [~/chunks/f*], with the larger files being streamed to STDOUT. Or the -S option could also result in a special file named [~/chunks/f.S], using the above -o option flag, which would contain all the skipped files in the format: size	/path/to/file. This would make fpart-using code transition trivial and also allow easier addition of 'bigfile' processing options.

Is there internal fpart logic that makes this problematic? I think it would just involve a test to check if the current file was > partition size, and if so, append the size and name to another (single) file in the same place as the rest of the partition files. Thanks again Harry

martymac commented 2 years ago

Hi Harry,

Sorry, this is a mistake, there's no point in forbidding printing excluded files when option -o is used. This is fixed (I don't know what I had in mind!).

Regarding the other part of the question, as already discussed, sending skipped files to a specific partition file could be feasible but would require more work and I don't know if it would make sense in live mode where partition files have to be created fast. It would break that paradigm.

Also, you would end up with a huge partition file, where triggers would probably not make sense either (so we would have to skip them for that partition, introducing an exception in their handling).

Printing them to STDOUT as skipped makes more sense to me ; as I wrote before I think it is a good balance between simplicity and efficiency, while respecting the global spirit of the tool.

Cheers,

Ganael.

hjmangalam commented 2 years ago

Thanks Ganael, Happy to contribute complaints that you have to address. ;) I can live with the way it's handled now (STDOUT vs a specific file).

My use of the live mode is admittedly specific to me and you have a better idea of the general use cases.

My one comment on the current output of '-S' is that it's a bit noisy:

S (1902116864): /home/hjm/Downloads/isos/lmde-4-cinnamon-32bit.iso
^ ^          ^^

Is there a reason for the 4 characters marked above except for debugging? ie, why not:

1902116864	/home/hjm/Downloads/isos/lmde-4-cinnamon-32bit.iso

The '-v' output goes to STDERR so it can be filtered, and that leaves a nicely segregated file:

$ ~/bin/fpart-1.4.1 -vvv -L -S -s 200m \
    -o ~/fpart/Fs/f ~/Downloads > chunk.exceptions
(lots of verbose STDERR output)

$ head -5 chunk.exceptions
S (327155712): /home/hjm/Downloads/RT-contents.ibd.most-of-rt-database.data
S (492830720): /home/hjm/Downloads/isos/debian-11.1.0-i386-netinst.iso
S (2009333760): /home/hjm/Downloads/isos/Fedora-Workstation-Live-x86_64-35-1.2.iso
S (3204448256): /home/hjm/Downloads/isos/kubuntu-20.04.3-desktop-amd64.iso

so rather than perform a couple of complex regex splits (OK, not very complex), all you need is simple (and usually default in many languages) split on whitespace.

Thanks! harry
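
For reference, the current 'S (size): path' lines shown in the sample output above can be parsed with a single regular expression. This is a minimal sketch based only on those sample lines; the name parse_skipped is hypothetical, and a path containing a newline would not survive line-based parsing.

```python
import re

# Matches lines such as:
#   S (1902116864): /home/hjm/Downloads/isos/lmde-4-cinnamon-32bit.iso
SKIPPED_LINE = re.compile(r"^S \((\d+)\): (.*)$")


def parse_skipped(line):
    """Return (size_in_bytes, path) for a skipped-file line, or None if the
    line does not have the 'S (size): path' shape."""
    match = SKIPPED_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

Paths containing spaces are handled, because everything after the ': ' separator is taken as the path.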

martymac commented 2 years ago

Hello Harry,

Skipped files are printed in the same way standard partitions/files are. I understand your concern as it has been chosen at the beginning of the project and never tuned since. I've never had feedback about that ; you're right there may be a simpler display format that could be used.

As that request may not be that urgent and is not specific to the special partition (if it should be changed, let's change it also for other partitions), can you open a separate bug report ? I'll work on it a bit later...

Best regards,

Ganael.