riscvarchive / riscv-linux

RISC-V Linux Port

Default Behavior of SR_XS Differs from Proxy Kernel #32

Closed seldridge closed 8 years ago

seldridge commented 8 years ago

Running a program with Xcustom instructions works by default in the proxy-kernel, but not in riscv-linux. This has been brought up before (http://stackoverflow.com/questions/32980262/how-come-linux-kernel-interferes-the-execution-of-risc-v-custom0-instruction-on), but without a resolution.

This happens because riscv-linux never initializes SR_XS, the XS field of the status CSR. As a result, any Xcustom instruction (e.g., custom0) raises an illegal instruction exception (see this line in rocket: https://github.com/ucb-bar/rocket/blob/bcf035f4e4ac6685ef811013a20b3dab5a9c9046/src/main/scala/rocket.scala#L197). By contrast, the proxy-kernel sets SR_XS during initialization in mstatus_init (https://github.com/riscv/riscv-pk/blob/529a6a3a0c42468bf815255697279e0e059a22db/pk/minit.c#L9).
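To make the mechanism concrete, here is a minimal user-space sketch of the XS field and the check described above. It is a model, not the rocket or kernel source. It assumes XS sits at bits 16:15 of the status register, as in the ratified privileged spec (earlier drafts used different positions), and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: XS occupies bits 16:15 of the status register (ratified
 * privileged spec). Earlier spec drafts placed the field elsewhere. */
#define XS_SHIFT 15
#define XS_MASK  (UINT64_C(3) << XS_SHIFT)

/* The four states encoded by the two-bit XS field. */
enum xs_state { XS_OFF = 0, XS_INITIAL = 1, XS_CLEAN = 2, XS_DIRTY = 3 };

/* Extract the XS field from a status value. */
static inline enum xs_state status_get_xs(uint64_t status)
{
    return (enum xs_state)((status & XS_MASK) >> XS_SHIFT);
}

/* Return a copy of status with the XS field replaced by s. */
static inline uint64_t status_set_xs(uint64_t status, enum xs_state s)
{
    assert(s >= XS_OFF && s <= XS_DIRTY);
    return (status & ~XS_MASK) | ((uint64_t)s << XS_SHIFT);
}

/* Model of the decode-time check (hypothetical helper): a custom
 * instruction is treated as illegal while XS is Off. */
static inline bool custom_insn_traps(uint64_t status)
{
    return status_get_xs(status) == XS_OFF;
}
```

With a status value of 0 (XS never initialized), `custom_insn_traps` returns true; after `status_set_xs(status, XS_INITIAL)` it returns false, which mirrors the difference between the two kernels.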

While I fully understand that the use case for Linux is very different from that of the proxy-kernel, what is the intended behavior for attached RoCC devices? Should SR_XS be enabled globally (for both user and supervisor), enabled at the user level only (as is currently done for the floating point unit), or controlled by a config parameter? I glanced at the configs but didn't see a specific setting for this.

At the user level, RoCC devices can be enabled with a quick modification to the start_thread function (https://github.com/riscv/riscv-linux/blob/master/arch/riscv/kernel/process.c#L53), but I'm wondering what the right global approach would be to prevent confusion between the behavior of the proxy-kernel and riscv-linux.
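As a rough illustration of the user-level modification, the sketch below models a start_thread-style routine that gives a new user thread an initial status word. It is not the kernel's actual code. `struct regs_model`, `start_thread_model`, the `enable_xs` flag, and the name `SR_XS_INITIAL` are hypothetical, and the bit positions assume the ratified privileged spec layout (FS at 14:13, XS at 16:15).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions (ratified privileged spec): FS = 14:13, XS = 16:15. */
#define SR_FS_INITIAL (UINT64_C(1) << 13)
#define SR_XS_INITIAL (UINT64_C(1) << 15)   /* hypothetical name */

/* Hypothetical stand-in for the saved user register frame. */
struct regs_model {
    uint64_t sstatus;
    uint64_t epc;
    uint64_t sp;
};

/* Model of a start_thread-style routine: choose the status bits that the
 * new user thread starts with, then set its entry point and stack. */
static void start_thread_model(struct regs_model *regs, uint64_t pc,
                               uint64_t sp, int enable_xs)
{
    assert(regs != NULL);
    regs->sstatus = SR_FS_INITIAL;        /* FPU usable from user mode */
    if (enable_xs)
        regs->sstatus |= SR_XS_INITIAL;   /* extension usable from user mode */
    regs->epc = pc;
    regs->sp = sp;
}
```

Because the bit is set only in the per-thread status word, this enables the extension for user code without turning it on for the supervisor.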

aswaterman commented 8 years ago

This is actually deliberate. There is presently no support for swapping extension state on context switches, so we did not want to enable extensions and cause unexpected runtime failures from two processes stomping on each other's state. The proxy kernel does not have this concern, since it only hosts one process.

The eventual intent is to use SBI calls to swap extension context, so the kernel doesn't need to know anything about accelerators except how large their context is.
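The idea of a kernel that knows only the size of an accelerator's context can be sketched as follows. This is a user-space model under stated assumptions: no such SBI calls are specified in this thread, so `fw_ext_save` and `fw_ext_restore` are hypothetical stand-ins for them, and `accel_model`, `task_model`, and `EXT_CTX_MAX` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXT_CTX_MAX 64   /* hypothetical upper bound on context size */

/* Stand-in for the accelerator's live state (hypothetical). */
struct accel_model {
    size_t  ctx_size;              /* the only detail the kernel knows */
    uint8_t live[EXT_CTX_MAX];     /* contents are opaque to the kernel */
};

/* Per-task opaque save area for extension state. */
struct task_model {
    uint8_t ext_ctx[EXT_CTX_MAX];
};

/* Hypothetical firmware services standing in for the proposed SBI calls. */
static void fw_ext_save(const struct accel_model *a, uint8_t *buf)
{
    memcpy(buf, a->live, a->ctx_size);
}

static void fw_ext_restore(struct accel_model *a, const uint8_t *buf)
{
    memcpy(a->live, buf, a->ctx_size);
}

/* Context-switch hook: save the outgoing task's extension state and load
 * the incoming task's, without interpreting the bytes. */
static void switch_ext_model(struct accel_model *a, struct task_model *prev,
                             struct task_model *next)
{
    assert(a->ctx_size <= EXT_CTX_MAX);
    fw_ext_save(a, prev->ext_ctx);
    fw_ext_restore(a, next->ext_ctx);
}
```

Without a hook like `switch_ext_model`, two tasks would share `live` directly and overwrite each other's state, which is the failure mode described above. A real implementation would likely also consult the XS dirty state to skip unnecessary saves.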

For now, if you know you'll only be running one process at a time that uses the extension, you can manually hack the kernel to set XS.

(Sent by email in reply to Schuyler Eldridge's message of Fri, Feb 26, 2016, 11:52 AM.)


seldridge commented 8 years ago

That does make sense. You can define state save/restore functions for the floating point unit, but obviously not for arbitrary extensions (unless they adhere to some fixed context size, as you mention). Thanks for the explanation.

I'll proceed with my own modifications to the kernel.