draperlaboratory / hope-policy-tool

Policy language tools

Look into memory usage #33

Open amstrnad opened 5 years ago

amstrnad commented 5 years ago

Running the policy-tool on a policy uses slightly over a GiB of memory.

It's possible this is reasonable and all of that memory is needed, but it could also be due to lazy computations. Some profiling should give us a better idea.

Sample test case: \time -v policy-tool -d -t entities -m . -o ../policy-engine/policy osv.frtos.main.heap
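
For repeated measurements, the peak memory figure that `\time -v` reports can also be collected from a script. The following is a minimal sketch, assuming a Unix system where Python's `resource` module is available; `peak_rss_of` is a hypothetical helper name, and the demo runs the Python interpreter itself rather than policy-tool so that it works anywhere.

```python
import resource
import subprocess
import sys


def peak_rss_of(cmd):
    """Run cmd to completion; return (exit code, peak RSS of waited-for children)."""
    completed = subprocess.run(cmd, check=False)
    # ru_maxrss is the largest resident set size seen among all children this
    # process has waited for so far: kilobytes on Linux, bytes on macOS.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


# Demo with a small child process; substitute the policy-tool command line
# from the sample test case to measure the real thing.
exit_code, rss = peak_rss_of([sys.executable, "-c", "x = list(range(1_000_000))"])
print(f"exit status {exit_code}, peak RSS {rss}")
```

Because `RUSAGE_CHILDREN` reports a maximum over every child waited for so far, each measurement is best taken in a fresh interpreter.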

amstrnad commented 5 years ago

It looks like this might be highly affected by the policies included in policies/osv/frtos/main.dpl

I had two global policies I was using as tests, and when removed from that file, the speed and memory usage were drastically improved. This is even though those test policies were not actively being used.

amstrnad commented 5 years ago

Command being timed: "policy-tool -d -t ../entities -m .. -o ../../policy-engine/policy osv.frtos.main.contextswitch-testContext2"

| Measurement | Baseline | 2 additional global policies |
| --- | --- | --- |
| User time (seconds) | 124.37 | 612.78 |
| System time (seconds) | 40.45 | 287.44 |
| Percent of CPU this job got | 2948% | 3187% |
| Elapsed (wall clock) time (h:mm:ss) | 0:05.59 | 0:28.24 |
| Maximum resident set size (kbytes) | 315,492 | 1,165,800 |
| Minor (reclaiming a frame) page faults | 135,268 | 651,065 |
| Voluntary context switches | 262,118 | 1,182,860 |
| Involuntary context switches | 2,690,337 | 2,317,723 |
| File system outputs | 256 | 256 |
| Page size (bytes) | 4096 | 4096 |
| Exit status | 0 | 0 |

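
To put the figures above in proportion, the growth factors can be computed directly: peak memory grows by roughly 3.7x and wall clock time by roughly 5x when the two extra global policies are present. A short sketch of that arithmetic, using only numbers copied from the measurements (the dictionary keys and the `growth_factor` name are illustrative):

```python
# (baseline, with 2 additional global policies), copied from the figures above.
measurements = {
    "user time (s)": (124.37, 612.78),
    "system time (s)": (40.45, 287.44),
    "elapsed time (s)": (5.59, 28.24),
    "max RSS (kB)": (315_492, 1_165_800),
    "minor page faults": (135_268, 651_065),
}


def growth_factor(baseline, with_extras):
    """How many times larger the second measurement is than the baseline."""
    return with_extras / baseline


ratios = {name: growth_factor(*pair) for name, pair in measurements.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```
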
sbrookes commented 5 years ago

I also recently had an issue with the way policy-tool is parsing main.dpl which I think may be related. I was writing a new policy and so I added an entry for it to the main.dpl equivalent for my policy. However, at some point while it was half-implemented I tried to run another (completed) policy out of my main.dpl and I couldn't, because policy-tool was greedily parsing everything it found from main.dpl and it errored out in my half-done policy.

While the idea of greedily parsing everything in the source tree for the policy you choose does make sense, I see the entire idea of main.dpl as a problem here.

We use main.dpl as a place to centralize composed policies. Issue https://github.com/draperlaboratory/hope/issues/71 should fix that problem.

However, if I remember correctly there's another problem... I don't remember where I saw it, but I think the DPL documentation says something like:

A policy has the form:

policy = (global_pol_1 (& global_pol_2 & ... & global_pol_n)) & pol_1 (& pol_2 & ... & pol_n)

This is why there are non-composite entries in main.dpl that say

rwx = rwxPol

Instead, I should be able to just have rwx.dpl define

rwx = rule ^ rule ^ rule

Then I could use osv.frtos.rwx directly, without a second file for indirection. That file ends up bloated with a ton of policies that I don't need most of the time, which bogs down parsing.

ccasin commented 5 years ago

Parsing any file you include is definitely intended and the correct behavior. I wouldn't call this "greedy" - it's just including the stuff you asked for.

As for rwx, I agree this would be nice to support. It's a bit tricky because ^ is used both for global policy composition and for rule composition with priority (your example of top level composition should have ^s in the first part). Currently the way we disambiguate is to assume global policy composition is meant at the top level. Probably we should find a smarter way to disambiguate or just pick a different symbol.

ccasin commented 5 years ago

(It's worth noting that you don't actually need another file in the rwx case. You can just as well put:

rwxPol = rwx

in your rwx.dpl, and then ask the policy tool to compile osv.frtos.rwx.rwxPol. But I still agree we should remove the need for this and allow osv.frtos.rwx.rwx as the top-level policy directly.)

sbrookes commented 5 years ago

I still feel like my namespace is polluted here. I would like a use case where the last component of the module name is both a filename and a policy name... so I can have osv.frtos.rwx

sbrookes commented 5 years ago

Maybe we can have some notion of a "main" policy per file which is the default policy if my module only goes to the file level and doesn't specify a specific policy name

ccasin commented 5 years ago

I'm fairly anti adding a "default" name, which seems like a gotcha that will confuse newcomers. I'm not quite sure what you mean by a "polluted" namespace - in a system with hierarchical modules, a.b.c.d is a pretty typical way to refer to a value d from the module a.b.c. Of course, if it feels awkward that your module and the value in it have the same name, you can always change one.
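
The naming scheme described here, where a.b.c.d means the value d in module a.b.c, can be illustrated with a small sketch. This is not policy-tool's actual resolution code; `split_policy_name` is a hypothetical helper, and the mapping of a module to a `.dpl` file under the module directory is assumed from the paths mentioned earlier in the thread (for example osv.frtos.main.heap and policies/osv/frtos/main.dpl).

```python
from pathlib import PurePosixPath


def split_policy_name(qualified_name):
    """Split 'a.b.c.d' into (module file 'a/b/c.dpl', policy name 'd')."""
    parts = qualified_name.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected module.policy, got {qualified_name!r}")
    # Everything but the last component names the module; the last is the policy.
    *module_parts, policy = parts
    module_file = PurePosixPath(*module_parts).with_suffix(".dpl")
    return str(module_file), policy


print(split_policy_name("osv.frtos.main.heap"))
print(split_policy_name("osv.frtos.rwx.rwxPol"))
```

Under this scheme a name always carries one more component than the file path, which is the source of the rwx.rwx repetition discussed above.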

sbrookes commented 5 years ago

Not having a one-to-one mapping between a file and a policy (especially one with the same name that is the only thing declared in that file) is a gotcha that confuses me every single time I write a new policy.

I just want to write a policy in a dpl file, point to that file, and have my policy be used. Having to say a.b.c.d when d is the only thing in c is cumbersome. I just want osv.frtos.heap because that is enough to describe which policy I want: the "open source version" (whatever) heap policy for frtos.

In order to make that name work right now, I'd need to make a frtos.dpl which includes heap = heapPol. That's a pointless statement and a pointless file. Plus, in order to make my other frtos-specific policies follow the same pattern (e.g. osv.frtos.xxx), that file would also have to include xxx = xxxPol. This will increase policy parsing time for osv.frtos.heap just so that I can use names that make sense.

Instead, I want to have a single file: osv/frtos/heap.dpl. If you don't like the notion of a specially named policy (e.g. main) that's fine, but if there's one policy in there, I clearly want to use that policy when I say osv.frtos.heap.
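
One way the single-policy fallback requested here might behave, as a sketch only: try the existing module.policy split first, and if that fails, treat the whole name as a file and accept it when the file declares exactly one policy. The `resolve` function and the `policies_by_file` mapping are hypothetical stand-ins for whatever policy-tool learns from parsing; nothing below reflects its real implementation.

```python
def resolve(qualified_name, policies_by_file):
    """Return (file, policy) for a dotted name, with a single-policy fallback."""
    parts = qualified_name.split(".")

    # Existing scheme: the last component is the policy, the rest is the file.
    if len(parts) >= 2:
        module_file = "/".join(parts[:-1]) + ".dpl"
        if parts[-1] in policies_by_file.get(module_file, []):
            return module_file, parts[-1]

    # Proposed fallback: the whole name is a file declaring exactly one policy.
    whole_file = "/".join(parts) + ".dpl"
    declared = policies_by_file.get(whole_file, [])
    if len(declared) == 1:
        return whole_file, declared[0]

    raise LookupError(f"cannot resolve {qualified_name!r}")


# Hypothetical source tree: heap.dpl declares only heapPol.
tree = {"osv/frtos/heap.dpl": ["heapPol"]}
print(resolve("osv.frtos.heap", tree))
print(resolve("osv.frtos.heap.heapPol", tree))
```

A file declaring two or more policies still requires the fully qualified name, so the fallback never has to guess.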

Note that I switched to heap instead of RWX because RWX is actually the same for frtos vs. bare metal runtimes and so it should really only have to be osv.rwx. Trying to get that name to work has a list of equivalent problems.

Side note, this is moving pretty far from Amanda's issue. Maybe we should migrate elsewhere?

sbrookes commented 5 years ago

I will just add that in the case of rwx, even osv doesn't make sense and so I really want to just name my policy rwx which is literally impossible given the current module resolution process.

ccasin commented 5 years ago

I think the idea to replace osv/frtos/main.dpl with osv/frtos.dpl is a good one, and one I was also thinking about last night. I also continue to agree that we should eliminate the need to do rwxPol = rwx. I disagree with your other intuition about names here (in particular I much prefer the consistency of having to specify a particular policy to the policy tool), and ours is a pretty typical scheme for hierarchical modules. But this argument doesn't seem to be getting anywhere - perhaps we can discuss more when I get back.

arunthomas commented 5 years ago

Could we drop or rename osv while we're at it?

ccasin commented 5 years ago

Yeah - osv is a holdover from when we first branched from Dover, and there are probably better options for a top-level module name. Do you have suggestions? We could declare the stuff in hope-policies to be our "standard policy library" and pick a name like standard, std, basics, or prelude.

sbrookes commented 5 years ago

Hooray for continuing to hijack Amanda's thread!

I don't know if we need any name... none of those words add anything that helps describe the policy for me...

ccasin commented 5 years ago

I think it makes sense to put our "standard" policies in a single module, to distinguish them from policies that may arise from other sources. But I'm certainly open to names better than osv.

arunthomas commented 5 years ago

Sounds reasonable. I like std or prelude.

sbrookes commented 5 years ago

Where does prelude come from? Between them I'm much more in favor of std.

arunthomas commented 5 years ago

http://hackage.haskell.org/package/base-4.12.0.0/docs/Prelude.html

std is fine with me. It's shorter to type.

amstrnad commented 5 years ago

There are two relatively easy performance improvements I've found.

The vast majority of memory usage is still in the lexer/parser.

As is, this isn't causing problems, but if we keep adding to main.dpl, it will.