ARK-Builders / arklib

Core of the programs in ARK family
MIT License
1 stars 10 forks source link

Refactor `src/index.rs` and Enhance Documentation #87

Closed tareknaser closed 4 months ago

tareknaser commented 5 months ago

Description

This PR focuses on refactoring the src/index.rs file.

Changes

The changes primarily include:

Edit

Fixes #89 with 633335ab42a603fe646b1bfe7b687a271b7ab9b3

github-actions[bot] commented 5 months ago

Benchmark for c3d41c0

Click to view benchmark | Test | Base | PR | % | |------|--------------|------------------|---| | compute_bytes_large/compute_bytes | 138.9±1.32µs | 140.1±2.81µs | +0.86% | | compute_bytes_medium/compute_bytes | 27.7±0.39µs | **27.0±0.35µs** | **-2.53%** | | compute_bytes_small/compute_bytes | 127.4±1.20ns | 127.7±2.16ns | +0.24% | | index_build/index_build/tests/ | **157.7±0.66µs** | 161.5±1.33µs | **+2.41%** | | tests/lena.jpg/compute_bytes | **13.3±0.09µs** | 13.5±0.28µs | **+1.50%** | | tests/test.pdf/compute_bytes | **111.1±0.69µs** | 136.8±0.54µs | **+23.13%** |
github-actions[bot] commented 5 months ago

Benchmark for 2848ee4

Click to view benchmark | Test | Base | PR | % | |------|--------------|------------------|---| | compute_bytes_large/compute_bytes | 140.5±2.35µs | 141.7±2.54µs | +0.85% | | compute_bytes_medium/compute_bytes | **27.8±0.21µs** | 28.5±0.55µs | **+2.52%** | | compute_bytes_small/compute_bytes | 131.4±1.21ns | 131.7±2.16ns | +0.23% | | index_build/index_build/tests/ | 160.6±1.62µs | **157.4±0.47µs** | **-1.99%** | | tests/lena.jpg/compute_bytes | 13.9±0.13µs | 14.0±0.12µs | +0.72% | | tests/test.pdf/compute_bytes | 111.7±3.01µs | 112.0±0.84µs | +0.27% |
kirillt commented 5 months ago

@tareknaser could you also check this issue?

github-actions[bot] commented 5 months ago

Benchmark for 838e655

Click to view benchmark | Test | Base | PR | % | |------|--------------|------------------|---| | compute_bytes_large/compute_bytes | 139.5±0.71µs | **138.6±2.83µs** | **-0.65%** | | compute_bytes_medium/compute_bytes | 28.5±0.21µs | **27.9±0.47µs** | **-2.11%** | | compute_bytes_small/compute_bytes | 131.7±2.80ns | 131.8±2.18ns | +0.08% | | index_build/index_build/tests/ | **162.7±0.48µs** | 171.3±2.25µs | **+5.29%** | | tests/lena.jpg/compute_bytes | 14.4±0.09µs | **13.9±0.06µs** | **-3.47%** | | tests/test.pdf/compute_bytes | 163.7±1.82µs | **111.2±0.34µs** | **-32.07%** |
github-actions[bot] commented 5 months ago

Benchmark for 6a1e945

Click to view benchmark | Test | Base | PR | % | |------|--------------|------------------|---| | compute_bytes_large/compute_bytes | 475.3±1.30µs | **145.2±2.21µs** | **-69.45%** | | compute_bytes_medium/compute_bytes | **27.2±0.32µs** | 28.5±0.30µs | **+4.78%** | | compute_bytes_small/compute_bytes | 131.5±2.31ns | 133.0±2.63ns | +1.14% | | index_build/index_build/tests/ | **162.7±0.78µs** | 173.3±0.74µs | **+6.52%** | | tests/lena.jpg/compute_bytes | 13.9±0.06µs | 13.9±0.11µs | 0.00% | | tests/test.pdf/compute_bytes | 115.7±1.85µs | 115.0±0.87µs | -0.61% |
github-actions[bot] commented 4 months ago

Benchmark for 1c39010

Click to view benchmark | Test | Base | PR | % | |------|--------------|------------------|---| | compute_bytes_large/compute_bytes | 572.8±1.88µs | **137.4±1.54µs** | **-76.01%** | | compute_bytes_medium/compute_bytes | 27.5±0.45µs | **26.8±0.43µs** | **-2.55%** | | compute_bytes_small/compute_bytes | 128.6±1.53ns | 129.2±3.06ns | +0.47% | | index_build/index_build/tests/ | **157.8±0.82µs** | 171.0±1.28µs | **+8.37%** | | tests/lena.jpg/compute_bytes | 13.4±0.07µs | 13.4±0.09µs | 0.00% | | tests/test.pdf/compute_bytes | 339.3±16.35µs | **108.6±1.37µs** | **-67.99%** |
github-actions[bot] commented 4 months ago

Benchmark for 3ca26c4

Click to view benchmark | Test | Base | PR | % | |------|--------------|------------------|---| | compute_bytes_large/compute_bytes | 472.9±1.29µs | **141.5±1.06µs** | **-70.08%** | | compute_bytes_medium/compute_bytes | **26.8±0.22µs** | 27.5±0.20µs | **+2.61%** | | compute_bytes_small/compute_bytes | 129.4±2.68ns | 129.1±0.89ns | -0.23% | | index_build/index_build/tests/ | **158.8±5.51µs** | 173.6±1.02µs | **+9.32%** | | tests/lena.jpg/compute_bytes | 13.3±0.07µs | 13.3±0.17µs | 0.00% | | tests/test.pdf/compute_bytes | 111.1±4.27µs | **110.1±0.47µs** | **-0.90%** |
github-actions[bot] commented 4 months ago

Benchmark for b6ce103

Click to view benchmark | Test | Base | PR | % | |------|--------------|------------------|---| | compute_bytes_large/compute_bytes | 141.5±1.56µs | 141.5±2.31µs | 0.00% | | compute_bytes_medium/compute_bytes | 27.8±0.22µs | 27.9±0.57µs | +0.36% | | compute_bytes_small/compute_bytes | 134.5±1.54ns | 134.9±4.61ns | +0.30% | | index_build/index_build/tests/ | **160.4±0.91µs** | 172.3±0.68µs | **+7.42%** | | tests/lena.jpg/compute_bytes | 13.9±0.07µs | 13.9±0.08µs | 0.00% | | tests/test.pdf/compute_bytes | **114.7±1.19µs** | 117.5±0.39µs | **+2.44%** |
kirillt commented 4 months ago

About change from milliseconds to nanoseconds in timestamps: we should double-check that we really can have nanoseconds on mobile platforms. I don't remember precisely, but it seems we had an issue with nanoseconds due to implicit rounding to milliseconds. Could be so even on Linux.

We should investigate why this threshold was introduced: https://github.com/ARK-Builders/arklib/commit/267b844f3a0f04586c7d6178180607fc3cb0fa0d#diff-701633ed548410624cfcae90d38a986adbd60deb5aeedd92355fe2e7702a9c18L34

tareknaser commented 4 months ago

About change from milliseconds to nanoseconds in timestamps: we should double-check that we really can have nanoseconds on mobile platforms. I don't remember precisely, but it seems we had an issue with nanoseconds due to implicit rounding to milliseconds. Could be so even on Linux.

Before making any changes to arklib on how time is stored, we need to understand the impact of storing time in nanosecond on mobile systems.

If we must keep the millisecond precision, we have two options to consider:

kirillt commented 4 months ago

@tareknaser we could use nanoseconds, but:

  1. Do we really need such a precision? Can several filesystem writes happen with time difference of e.g. 10ns? I hope milliseconds is enough.
  2. We also should care about compatibility between different platforms. I'm not sure that Android provides developers with timestamps better than milliseconds.

I could imagine us needing microseconds to timestamp realistically filesystem events. Would we have the rounding issues in this case?

We can track only milliseconds when retrieving last_modified of each entry. This can be done by converting the SystemTime to milliseconds in scan_entry

This sounds like the way to go.

We can keep the current implementation, which stores time in milliseconds. Only problem is that IndexEntry objects won't be equal when writing to ark file and then reading from it.

Is this sill a problem if we go the way with converting to milliseconds in scan_entry?

tareknaser commented 4 months ago

Do we really need such a precision? Can several filesystem writes happen with time difference of e.g. 10ns? I hope milliseconds is enough.

Yes I think milliseconds is enough. Only issue is https://github.com/ARK-Builders/arklib/pull/87#discussion_r1509775947

Is this sill a problem if we go the way with converting to milliseconds in scan_entry?

No. that should solve it