Closed tuliom closed 2 years ago
I'm leaving this as draft for now, because I haven't tested it fully yet. I'm asking myself if deflate should be modified as well.
I already mentioned this to Tulio. But as a reminder to self and for record keeping.
Examine the variables target_sz vs target_sz_expected about these 3 lines.
https://github.com/libnxz/power-gzip/blob/master/lib/nx_inflate.c#L1180
https://github.com/libnxz/power-gzip/blob/master/lib/nx_inflate.c#L1199
https://github.com/libnxz/power-gzip/blob/master/lib/nx_inflate.c#L1223
Line 1093 is touching unlimited number of pages which would may lead to the page faulting issue.
It seems it is better to touch target_sz_expected pages than target_sz.
Because in line 1062 the code is limiting the output to inflate_per_job_len.
Let me clarify: I'm rewriting this PR and I'm changing exactly how target_sz
and target_sz_expected
are affecting the algorithm.
I have incorporated @abalib 's comments. However, while looking at the code I found an issue (commit Fix the total amount of pages being touched
) and I realized it was necessary to change the code that manipulate the DDE by limiting the amount of input/output sent to the NX instead of just limiting the amount of pages touched.
In other words: if we send too much input/output, we have to touch all of their pages, otherwise we'll have to pay the price for treating ERR_NX_AT_FAULT
.
If we send too little, we decrease performance.
So, I decided to use the values in source_sz_expected
and target_sz_expected
plus an increase in the amount of output whenever that's available in order to avoid excessive ERR_NX_DATA_LENGTH
.
Unfortunately this PR has now 6 commits. Commits 2, 4 and 5 are just reorganizing/improving code. I suggest reviewers to take a look one commit at a time.
I also think this is ready for review now.
So, I decided to use the values in
source_sz_expected
andtarget_sz_expected
plus an increase in the amount of output whenever that's available in order to avoid excessiveERR_NX_DATA_LENGTH
.
Looks right conceptually
@mscastanho I updated the patches with the fixes that you requested.
Before inflating a file, all the input and output pages are touched. However, when this file is too large, there is no need to touch all its pages, just as many as we can fit in a single NX job.
By not doing this, we end up dealing with too many page faults than necessary for each NX request.