martymac / fpart

Sort files and pack them into partitions
https://www.fpart.org/
BSD 2-Clause "Simplified" License

Chunk files contain less lines than expected #10

Closed mrzor closed 4 years ago

mrzor commented 5 years ago

Greetings!

I'm using parsyncfp, which uses fpart under the hood. The idea is to copy a (pretty huge) /home dir onto /mnt/new_home (which is on a different device, formatted with notably more inodes). There are (afaik) about 200M files to enumerate, most of them quite small.

fpart -v -L -z -s 10485760 -o /mnt/new_home/zor/fpcache/f //home

While troubleshooting what I deem pretty slow file enumeration (which runs contrary to everything I could read about fpart), I ran into a discrepancy between what the fpart log says about my chunks (about 50K files per chunk) and the actual content of the chunk files (less than 1K lines per chunk, except for one at about 1.5K).

At this point I believe this to be some fundamental misunderstanding on my part about fpart and probably not a bug. Would you be kind enough to clarify what is going on?

[root:/mnt/new_home/zor/fpcache] [base] # tail fpart.log.00.57.52_2019-08-10
Filled part #162: size = 10487077, 46953 file(s)
Filled part #163: size = 10486100, 44556 file(s)
Filled part #164: size = 10485854, 46049 file(s)
Filled part #165: size = 10485967, 46284 file(s)
Filled part #166: size = 10487488, 46771 file(s)
Filled part #167: size = 10485843, 46619 file(s)
Filled part #168: size = 10486423, 48616 file(s)
Filled part #169: size = 10486659, 46067 file(s)
Filled part #170: size = 10485845, 44444 file(s)
Filled part #171: size = 10485861, 44997 file(s)

[root:/mnt/new_home/zor/fpcache] [base] # ls -l f.* | tr -s ' ' | sed 's/f\.//' | awk '// { print $9 " " $8 }' | sort -n | tail | awk '// { "wc -l < f." $1 | getline linecount; print "f." $1 ", " linecount " lines, written at " $2 } '
f.163, 1547 lines, written at 04:49
f.164, 638 lines, written at 04:52
f.165, 983 lines, written at 04:55
f.166, 937 lines, written at 04:58
f.167, 1062 lines, written at 05:01
f.168, 456 lines, written at 05:04
f.169, 715 lines, written at 05:07
f.170, 704 lines, written at 05:11
f.171, 695 lines, written at 05:14
f.172, 621 lines, written at 05:18
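
For anyone wanting to repeat this comparison, here is a minimal sketch that puts the logged count and the actual line count side by side for each partition. It rests on two assumptions that are not confirmed anywhere in this thread: that log lines have exactly the "Filled part #N: size = S, C file(s)" shape shown in the tail output, and that part #N is written to the file "<prefix>.N". The function name compare_counts is made up for illustration.

```shell
#!/bin/sh
# Compare the file count logged for each partition with the number of lines
# in the matching partition file.
# Assumptions (not confirmed by fpart's documentation here):
#   - log lines look like "Filled part #N: size = S, C file(s)"
#   - part #N is stored in "<prefix>.N"
compare_counts() {
    log=$1
    prefix=$2
    # Extract "N C" pairs from the log, one pair per filled partition.
    sed -n 's/^Filled part #\([0-9][0-9]*\): size = [0-9][0-9]*, \([0-9][0-9]*\) file(s)$/\1 \2/p' "$log" |
    while read -r part logged; do
        file="$prefix.$part"
        # Skip partitions whose file is not present.
        [ -f "$file" ] || continue
        # Arithmetic expansion strips the padding some wc implementations add.
        actual=$(( $(wc -l < "$file") ))
        if [ "$actual" -eq "$logged" ]; then
            status=ok
        else
            status=MISMATCH
        fi
        echo "$file logged=$logged lines=$actual $status"
    done
}
```

With the files from this report it would be called as `compare_counts fpart.log.00.57.52_2019-08-10 f` from the fpcache directory.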

martymac commented 5 years ago

Hello @mrzor,

That's weird : you should get a single file per line (the line count should match exactly the file count in the partition).

Can you perform a simple 'wc -l' on a specific partition file ? You can also manually check the partition file's contents as it is a simple text file (maybe you can restrict the crawl to a smaller file tree to be able to get simpler results).

Finally, be sure you have not used fpart's option -0 in your manual test. In that case, you would get several files per line in partition files (separated by \0s and only file names containing \n would produce lines that would be counted by 'wc -l'). That could explain what you are observing.

Best regards,

Ganael.
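
To rule the -0 case in or out quickly, a partition file can be counted both ways. This is only a sketch: the two function names are hypothetical, and it assumes nothing about fpart beyond what is said above (one entry per line normally, entries separated by \0 when -0 is used).

```shell
#!/bin/sh
# Count entries in a partition file in two ways.

# Newline-separated entries: what 'wc -l' reports.
count_newline_entries() {
    echo $(( $(wc -l < "$1") ))
}

# NUL-separated entries: delete every byte except NUL, then count what is left.
count_nul_entries() {
    echo $(( $(tr -cd '\000' < "$1" | wc -c) ))
}
```

If count_nul_entries returns 0 for a partition file, that file contains no \0 separators, so the line count from wc -l is the relevant number.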

mrzor commented 5 years ago

Hello !

> Can you perform a simple 'wc -l' on a specific partition file ?

That's what I did - at first manually, then in a more systematic fashion with my second command (you might notice a wc -l invocation nested inside the awk script at the end of the pipeline)

> You can also manually check the partition file's contents as it is a simple text file

I did check them manually at first - they looked too small, hence the issue :)

> (maybe you can restrict the crawl to a smaller file tree to be able to get simpler results).

This is something I have yet to explore, I'll do that now. (First order of business was completing the sync, which thanks to some other optimizations eventually completed using rsync 3.x).

> Finally, be sure you have not used fpart's option -0 in your manual test.

I'm sure that is not the case - the chunk file had hundreds of line-separated records.

Best,

martymac commented 5 years ago

> (maybe you can restrict the crawl to a smaller file tree to be able to get simpler results).

> This is something I have yet to explore, I'll do that now. (First order of business was completing the sync, which thanks to some other optimizations eventually completed using rsync 3.x).

Yes, please. To help you debug: you can double the -v option (-vv) to ask fpart to display the files it finds while crawling the FS. You can then monitor a specific partition with (e.g.) 'tail -F /mnt/new_home/zor/fpcache/f.122' and compare the results. You should see the same files.

Also, can you tell me what version of fts you are using (see fpart -V) ? system or embedded ? On which OS ?
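
A small-tree check could look like the sketch below. It assumes fpart is installed and that -s and -o behave as in the command from the original report; the helper names (make_tree, count_listed, check_tree), the 100-byte partition size, and the tree layout are all made up for illustration. The -z and -L options are left out on purpose so that only regular files should be listed.

```shell
#!/bin/sh
# Build a small tree of tiny files, partition it with fpart, and compare the
# number of files on disk with the number of lines across all partition files.

# Create N one-byte files spread over ten subdirectories of DIR.
make_tree() {
    dir=$1
    n=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        mkdir -p "$dir/d$((i % 10))"
        printf 'x' > "$dir/d$((i % 10))/file$i"
        i=$((i + 1))
    done
}

# Total number of lines across every file named "<prefix>.<suffix>".
count_listed() {
    echo $(( $(cat "$1".* 2>/dev/null | wc -l) ))
}

# Run fpart on TREE with output prefix OUT, then compare the two counts.
# Returns non-zero if fpart fails or the counts differ.
check_tree() {
    tree=$1
    out=$2
    fpart -s 100 -o "$out" "$tree" > /dev/null || return 1
    # File names generated by make_tree contain no newlines, so counting
    # lines of find output is safe here.
    expected=$(( $(find "$tree" -type f | wc -l) ))
    listed=$(count_listed "$out")
    echo "expected=$expected listed=$listed"
    [ "$expected" -eq "$listed" ]
}
```

For example, `make_tree /tmp/fptest/tree 1000` followed by `check_tree /tmp/fptest/tree /tmp/fptest/parts/p` (with /tmp/fptest/parts created beforehand) would report both counts for a 1000-file tree.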

martymac commented 5 years ago

Hello,

Have you been able to perform further testing ?

Cheers,

Ganael.

mrzor commented 5 years ago

Hello,

I lost access to the original dataset, at least for the time being. I'm working on generating one with similar properties on my own machine; hopefully this will allow for a successful reproduction here.

Best,

-- e!ie

martymac commented 5 years ago

Hello there,

OK, thanks for the update. I'll be happy to see what's going on here if you can reproduce the case :)

Best regards,

Ganael.

mrzor commented 5 years ago

Hello,

A quick update:

I did complete my case reproducer script - so far no luck reproducing, using fpart built from master (at 05260f2a4f0401c49e13d4d66fb89e04ef1e1e85).

Also no luck reproducing (so far) with the (rather ancient) fpart 0.9.2 that was bundled with parsyncfp.

I'm trying to come up with more experimental parameters that could help reproduce without forcing me to resort to spinning up a GCP instance and using an actual hard drive.

martymac commented 5 years ago

Hello Eli,

Thanks for the update. I'll leave that issue open for a while. Don't hesitate to update it if you can reproduce the problem.

Thanks, Best regards,

Ganael.

martymac commented 4 years ago

Hello Eli,

Any news ? Have you been able to reproduce the problem ?

Regards,

Ganael.

martymac commented 4 years ago

I guess that bug report can be closed now.

mrzor commented 4 years ago

Sorry I couldn't be of more help on this.