keystone-enclave / keystone

Keystone Enclave (QEMU + HiFive Unleashed)

SM trap handler gives "illegal instruction" on CVA6 #374

Closed jarkkojs closed 1 year ago

jarkkojs commented 1 year ago

I've mentioned the CVA6 freezes before but could not properly debug them because the driver was also sometimes acting weird. Now that I have rewritten the Keystone driver, I got further with this.

What is happening is that there is a storm of timer interrupts, and the SDK ends up issuing resume ioctls indefinitely. It just stays in the loop, and every resume returns with SBI_ERR_SM_ENCLAVE_INTERRUPTED.

If I run the same fw_payload.bin in QEMU, this does not occur and ./tests.ke runs perfectly to the end.

Has anyone had similar experiences on hardware, or ideas about what I should look at? The platform is a Genesys 2 FPGA with CVA6.

Also, I checked that it never gets as far as an edge call, so nothing is dispatched; this happens early on, when the enclave starts running.
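For readers following along, the resume loop described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the Keystone SDK or driver code: `run_with_watchdog`, `resume_fn`, the `RUN_*` status values, and the simulated `fake_resume`/`stuck_resume` callbacks are all hypothetical names, and the status values are placeholders rather than the real SBI error codes. The sketch counts consecutive "interrupted" returns so that an interrupt storm shows up as a bounded failure instead of an endless loop.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder status codes for illustration only; the real values are
 * defined by the Keystone security monitor and are not reproduced here. */
enum run_status {
    RUN_DONE = 0,        /* enclave exited */
    RUN_INTERRUPTED = 1, /* stands in for SBI_ERR_SM_ENCLAVE_INTERRUPTED */
    RUN_EDGE_CALL = 2    /* stands in for an edge call to the host */
};

/* Hypothetical stand-in for whatever issues the resume ioctl. */
typedef enum run_status (*resume_fn)(void *ctx);

struct resume_stats {
    unsigned long resumes;              /* total resume attempts */
    unsigned long max_interrupt_streak; /* longest run of interrupted returns */
};

/* Resume until the enclave exits. Returns 0 on exit, or -1 if
 * streak_limit consecutive resumes come back interrupted with no
 * edge call in between (i.e. no visible forward progress). */
int run_with_watchdog(resume_fn resume, void *ctx,
                      unsigned long streak_limit,
                      struct resume_stats *stats)
{
    unsigned long streak = 0;

    assert(resume != NULL && stats != NULL);
    stats->resumes = 0;
    stats->max_interrupt_streak = 0;

    for (;;) {
        enum run_status st = resume(ctx);
        stats->resumes++;

        if (st == RUN_DONE)
            return 0;
        if (st == RUN_EDGE_CALL) {
            streak = 0; /* the enclave made progress; reset the streak */
            continue;
        }
        /* Interrupted: resume again, but remember how often in a row. */
        streak++;
        if (streak > stats->max_interrupt_streak)
            stats->max_interrupt_streak = streak;
        if (streak >= streak_limit)
            return -1;
    }
}

/* Simulated enclave that is interrupted a fixed number of times, then exits. */
struct fake_enclave {
    unsigned long interrupts_before_done;
};

enum run_status fake_resume(void *ctx)
{
    struct fake_enclave *e = ctx;

    if (e->interrupts_before_done == 0)
        return RUN_DONE;
    e->interrupts_before_done--;
    return RUN_INTERRUPTED;
}

/* Simulated enclave stuck in an interrupt storm: never makes progress. */
enum run_status stuck_resume(void *ctx)
{
    (void)ctx;
    return RUN_INTERRUPTED;
}
```

Calling `run_with_watchdog(stuck_resume, NULL, 1000, &stats)` returns -1 after 1000 resumes, which mirrors the behaviour reported on the FPGA, while the `fake_resume` case models the QEMU run where the enclave eventually finishes.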

dkohlbre commented 1 year ago

Hey @jarkkojs, I want to first emphasize that I genuinely appreciate many of your contributions here: bugs, threads of debugging, patch proposals, etc. You've put in a ton of work that is both valuable and desired for Keystone.

Going forward I want to give both some context about how Keystone is managed, and some requests to you that will help us engage productively.

Keystone started as a research project with two people working on it (technically, one student doing a course project!). Over time it has evolved and gained/lost contributors, almost all of whom are either undergraduate or graduate students. As a result, goals and systems in Keystone have changed over time, driven by whatever research project a lab is currently working on. Code quality has never been up to the standards it could be, projects get abandoned when students graduate, etc.

Decisions usually happen in 1-1s with students or research meetings. This has worked well for us to do academic research internally, and is pretty bad for external folks to understand what we're up to. We can improve here.

Keystone is very unlikely to have the kind of active and communicative community leadership that I think you are looking for. We simply don't have the people to do it. (e.g. I've meant to reach out about your contributions and how we can be more effective in enabling you, but this is the first chance I've had since you started working on things!)

There are a few things we can try and do to improve this. They are all going to be slow, since being on the academic side means we can't throw resources into it. I can see a few concrete suggestions you've got that we can and will act on:

These are things that we can do better on, and I'm absolutely going to talk with the other folks working on projects in Keystone about it.

I'd like to ask for several things from your side:

If a call would help figure some of this out, we can find a time to do so. Shoot me an email at dkohlbre@cs.washington.edu

(FWIW, there was at one point a public roadmap and more regular communication on mailing lists etc. Then people graduated, commitments changed, and different projects started.)

jarkkojs commented 1 year ago

I agree with most of your points, and I'm sorry; I've just been somewhat frustrated after a month of digging into the same problem :-) And I don't think what I'm after is the "project's fault". Getting page table sync right is a super difficult problem across multiple platforms, even on the same ISA. I created a more focused bug report last night, so I'll continue there. I think where things fall apart is the stvec update, but I'll report my findings (and just my findings, not frustration) here: https://github.com/keystone-enclave/keystone/issues/378

You are free to comment on that, but I'll continue to converge on the root cause, which has narrowed down over the month, so at least there's been some progress.

I'll add the email to my address book. Thank you and apologies :-)