tueda / formset

A "form.set" generator.
MIT License

refactor: take parameter scaling out of main() (issue #2) #4

Closed magv closed 3 years ago

magv commented 3 years ago

Here's the proposed solution to issue #2. Compared to the diff I attached there, I've renamed search to scale and made it a class member of Setup, which seemed a bit more appropriate (although I have no strong preference either way). I've tried to preserve the current CLI behavior fully, although my testing wasn't thorough.

The lowest_scale parameter is needed (by secdec) to define the fallback configuration. For the moment I plan to set it to 1.0, so that parameters are only scaled up; the formset.py CLI assumes it to be 0.0, but that isn't a suitable fallback for secdec, because it would set SmallSize, LargeSize, etc. all to zero, and FORM would likely fail with cryptic messages. In that case I'd prefer it to fail with "not enough memory", hence 1.0.
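To illustrate the point about the 0.0 fallback, here is a toy sketch; the parameter names echo FORM's SmallSize/LargeSize, but the values and the scaled() helper are invented for illustration and are not formset's actual code:

```python
# Toy illustration of the fallback choice above. The setup values are
# made up and scaled() is a hypothetical helper, not formset's code.
defaults = {"SmallSize": 10_000_000, "LargeSize": 50_000_000}

def scaled(setup, s):
    # scale every buffer size by the factor s
    return {k: int(v * s) for k, v in setup.items()}

# With lowest_scale = 0.0 the fallback zeroes out every buffer, which
# FORM would reject with confusing errors:
print(scaled(defaults, 0.0))  # {'SmallSize': 0, 'LargeSize': 0}

# With lowest_scale = 1.0 the fallback is simply the unscaled defaults,
# so an oversized job fails cleanly with "not enough memory" instead:
print(scaled(defaults, 1.0))
```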

The second return value of scale is needed by main() to detect insufficient memory and fail in that case. Otherwise I don't care much for this value and would prefer it to be gone. I considered using a check like sp.scale(2GB).calc() < 2GB in main() instead, but at least in theory it can be unreliable, since bisection might produce a configuration that uses 2GB+1B of memory. The probability of this is almost zero, but I wouldn't want to risk it anyway.

Note that I'm not at all attached to the API as it exists: if you'd like to rename scale, move it around, drop that second return value, etc -- this will be completely fine by me, and now is the best time to think about it.

magv commented 3 years ago

That lint failure says:

Run gitlint
1: CT1 Title does not start with one of build, chore, ci, docs, feat, fix, perf, refactor, revert, style, test: "Merge c993bcabc5526784c28c85c2e7cb69d1e5e0838d into 658af6a7d1aed6d83e994471bb2b0491300e75fb"
1: CT1 Title does not follow ConventionalCommits.org format 'type(optional-scope): description': "Merge c993bcabc5526784c28c85c2e7cb69d1e5e0838d into 658af6a7d1aed6d83e994471bb2b0491300e75fb"
1: T1 Title exceeds max length (92>80): "Merge c993bcabc5526784c28c85c2e7cb69d1e5e0838d into 658af6a7d1aed6d83e994471bb2b0491300e75fb"
Error: Process completed with exit code 3.

It seems the linter doesn't get along with how GitHub tests pull requests.

tueda commented 3 years ago

Thank you for your pull request! Indeed, I thought adding a method to the Setup class would be an option.

In this case, I agree that it looks better if we remove the second return value of scale:

    def scale(self, total_memory, lowest_scale=0.0, human_readable=False):
        # type: (int, float, bool) -> Setup

I think it is safe to write the main function as

    sp = sp.scale(memory, human_readable=args.human_readable)

    # Final memory usage we've found.
    memory_usage = sp.calc()

    memory_usage_rounded = memory_usage
    if args.human_readable:
        memory_usage_rounded = round_human_readable(memory_usage_rounded, True, False)

    if memory_usage_rounded > memory:
        parser.exit(
            -1,
            "failed to find parameters: {0} bytes shortage".format(
                memory_usage_rounded - memory
            ),
        )

in such a way that the memory usage (rounded the same way as in f(x)) is checked. (Did I miss something?)

If these are OK, please apply them, then rebase and force-push, like

git remote add upstream https://github.com/tueda/formset.git
git fetch upstream
git rebase upstream/master

git push origin main --force

(Now the gitlint in the CI should ignore merge commits.) Or, if you want, I can merge your PR as it is and then I will make these changes.

magv commented 3 years ago

Done, almost as you suggested.

By the way, your default branch is upstream/main rather than upstream/master.

magv commented 3 years ago

The reason I didn't initially use an if memory_usage > memory check in main() is that, if max_iterations in scale() is set too low, scale() might return parameters with slightly larger memory usage than requested, simply because the bisection hasn't yet converged to the precise minimum and the result is still off by a byte or so.
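To make the concern concrete, here is a standalone sketch (toy usage() model and invented names, not formset's actual scale()): bisection keeps a bracket [lo, hi] whose width halves each iteration, so with too few iterations any result taken from near the top of the bracket can exceed the budget even though a feasible configuration exists.

```python
def usage(x):
    # toy monotone memory model, standing in for Setup.calc()
    return 1000.0 * x

def bracket(budget, lo=0.0, hi=10.0, max_iterations=50):
    # maintain the invariant usage(lo) <= budget < usage(hi)
    for _ in range(max_iterations):
        mid = (lo + hi) / 2
        if usage(mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo, hi

lo5, hi5 = bracket(2460.0, max_iterations=5)
lo50, hi50 = bracket(2460.0, max_iterations=50)
print(hi5 - lo5)    # 0.3125: a result near hi5 overshoots the budget
print(hi50 - lo50)  # ~9e-15: after 50 iterations the overshoot is negligible
```

Returning lo (the feasible end of the bracket) is always safe in this model; the overshoot only appears when the result is drawn from inside the bracket or parameters are rounded up afterwards.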

tueda commented 3 years ago

Merged. Thank you for your contribution!

If I understand correctly, your concern might be that x1 is larger than the solution when the algorithm can't find any zero point of f(x) and doesn't enter the actual bisection (x2 is None in this case; in the actual bisection, on the other hand, f(x1) <= 0 <= f(x2)). I think this case is handled by

        miny, minsp = f(lowest_scale)
        if miny >= 0:
            return minsp

Anyway, I think one byte doesn't matter; Setup.calc() has a larger estimation error than that.

magv commented 3 years ago

> If I understand correctly, your concern might be that x1 is larger than the solution when the algorithm can't find any zero point of f(x) and doesn't enter the actual bisection

Not exactly. To illustrate my concern:

    ./formset.py -p 30 -Hu
    2460M

But if I change max_iterations to e.g. 5:

    ./formset.py -p 30 -Hu
    failed to find parameters: 91400K bytes shortage

The concern is that scale() may return parameters that use more memory than requested, even when a configuration using less memory exists. If so, main() will fail.

With max_iterations set to 50 this concern is more theoretical than practical, but it still makes me uneasy. In any case, my own usage is not affected by this issue.

tueda commented 3 years ago

Thanks for the clarification. I hope max_iterations = 50 is enough to find at least one zero point in most cases, as 1/2^50 ~ 10^-15, but it's also true that this is not guaranteed to work. Maybe the search algorithm should be improved... (Or we could just increase max_iterations in the future.)
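A quick sanity check of the 1/2^50 figure: after 50 bisection steps the remaining bracket is a 2^-50 fraction of the initial search interval.

```python
# Relative bracket width left after 50 halvings of the search interval.
rel = 1 / 2**50
print(f"{rel:.2e}")  # 8.88e-16, i.e. on the order of 10^-15
```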