polarfire-soc / polarfire-soc-documentation

Using L2 as scratchpad on top of Linux OS #14

Closed dawsfox closed 2 years ago

dawsfox commented 2 years ago

I am unable to find documentation on using the L2 as a scratchpad while running Linux. I've found plenty of documentation on configuring the L2 as a scratchpad with the MSS Configurator, but every example linker script I have found is intended for bare-metal execution. I need one because I have to explicitly place some data structures in the scratchpad portion of memory. Thanks for any help you can provide.

dawsfox commented 2 years ago

For clarity: I'm developing a program that manages the scratchpad memory manually. Specifically, certain data structures must be stored there for speed, and the program applies an application-specific caching scheme to those structures. I'm writing this program from within Linux (using its default compiler) on an Icicle Kit image built following the meta-polarfire-soc-yocto-bsp repo guide. Because the program will run under Linux rather than as a bare-metal application, I assume I will need to explicitly map those data structures to the disabled cache ways (the scratchpad area) in the L2. I haven't been able to find any documentation on using the scratchpad under Linux. Am I missing something?

griffini commented 2 years ago

We don't tend to use the scratchpad in Linux.

Currently, the HSS sets up 4 ways as L2 scratchpad:

[19.23509] HSS_DDRPrintL2CacheWaysConfig(): L2 Cache Configuration:
    L2-Scratchpad:  4 ways (512 KiB)
         L2-Cache:  8 ways (1024 KiB)
           L2-LIM:  4 ways (512 KiB)

Right now, the HSS is using the first 320 KiB of this 512 KiB, so you could, for example, use the remaining 192 KiB.
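The arithmetic behind that suggestion can be sketched in C as below. The 512 KiB and 320 KiB figures come from the HSS log and the comment above; the base address `0x0A000000` is an assumption (it is not stated in this thread) and should be checked against the memory map in the HSS linker script. The names `SCRATCHPAD_BASE`, `scratch_free_base` and `scratch_free_size` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: physical base of the L2 scratchpad region. Not given in this
 * thread; verify it against the HSS linker script for your design. */
#define SCRATCHPAD_BASE   UINT64_C(0x0A000000)

/* From the HSS boot log: 4 ways configured as scratchpad, 512 KiB total. */
#define SCRATCHPAD_SIZE   (UINT64_C(512) * 1024)

/* From the comment above: the HSS uses the first 320 KiB itself. */
#define HSS_RESERVED_SIZE (UINT64_C(320) * 1024)

/* First physical address past the part of the scratchpad used by the HSS. */
static uint64_t scratch_free_base(void)
{
    return SCRATCHPAD_BASE + HSS_RESERVED_SIZE;
}

/* Number of bytes left over for an application. */
static uint64_t scratch_free_size(void)
{
    assert(HSS_RESERVED_SIZE <= SCRATCHPAD_SIZE);
    return SCRATCHPAD_SIZE - HSS_RESERVED_SIZE;
}
```

With these figures the free window is 192 KiB and starts 0x50000 bytes above the assumed base. If the LIM ways are reallocated to scratchpad, only `SCRATCHPAD_SIZE` would need to change.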

If you need more, you can reallocate the LIM ways to scratchpad - this will require modifications to the design configuration XML (either manually, or using the Libero PolarFire SoC MSS Configurator) - for example, https://github.com/polarfire-soc/hart-software-services/blob/master/boards/mpfs-icicle-kit-es/soc_fpga_design/xml/ICICLE_MSS_mss_cfg.xml for the Icicle Kit.

You can see a memory map with this in the HSS linker script.

How are you planning on accessing this physical address from Linux? Using something like /dev/mem?

dawsfox commented 2 years ago

Thanks for your response. I have already reconfigured the L2 using the MSS Configurator to expand the scratchpad region, so my main question was whether or not there was an existing or established way to access the L2 scratchpad region from an application in the Linux OS (your response indicates there isn't one). I was hoping there was some sort of software mechanism to indicate a particular data structure should be placed in that region. I will look into ways to access the physical address from Linux unless you have other suggestions. Once again, I appreciate your response!

griffini commented 2 years ago

That is why I asked whether you plan to use /dev/mem.

See https://bakhi.github.io/devmem/ for example.

You can open /dev/mem, and mmap it in a userspace process to get access to physical memory.
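A minimal sketch of that approach follows, assuming a kernel that permits `/dev/mem` access to the target range (for example, `CONFIG_STRICT_DEVMEM` may restrict it). The helper names `map_phys` and `unmap_phys` and the `struct phys_map` type are hypothetical, not part of any Microchip or HSS API. `O_SYNC` is passed as in common `/dev/mem` examples; how it affects the caching attributes of the mapping is architecture dependent and is not verified here for PolarFire SoC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Bookkeeping for one mapping, so it can be unmapped cleanly later. */
struct phys_map {
    void  *base;     /* page-aligned address returned by mmap() */
    size_t map_len;  /* length actually mapped, including alignment slack */
    void  *ptr;      /* pointer corresponding to the requested address */
};

/* Map `len` bytes starting at offset `phys` of `dev` (e.g. "/dev/mem").
 * mmap() needs a page-aligned offset, so round down and keep the delta.
 * Returns 0 on success and -1 on failure. */
static int map_phys(const char *dev, off_t phys, size_t len,
                    struct phys_map *out)
{
    assert(dev != NULL && out != NULL && len > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    off_t  aligned = phys & ~((off_t)page - 1);
    size_t delta   = (size_t)(phys - aligned);

    int fd = open(dev, O_RDWR | O_SYNC);
    if (fd < 0)
        return -1;

    void *base = mmap(NULL, len + delta, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, aligned);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return -1;

    out->base    = base;
    out->map_len = len + delta;
    out->ptr     = (uint8_t *)base + delta;
    return 0;
}

/* Release a mapping created by map_phys(). */
static void unmap_phys(struct phys_map *m)
{
    munmap(m->base, m->map_len);
}
```

A call might look like `map_phys("/dev/mem", phys_addr, size, &m)`, run as root, where `phys_addr` and `size` describe the unused part of the scratchpad taken from the HSS linker script. The returned `m.ptr` can then be cast to a pointer to the application's own data structure, typically `volatile` if other harts touch the same memory.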

dawsfox commented 2 years ago

With regard to the /dev/mem method, I have some concerns. Mainly, my application requires multiple cores to access the data structures in the scratchpad region, which, as the resource you sent mentions, can break cache coherence because the scratchpad is a cacheable region. Also, since the application is geared towards high performance, I'd like to avoid the method mentioned in the link that makes the region non-cacheable. To your knowledge, is there any custom way to integrate the scratchpad region, as I have it configured, into the virtual address space, and to place specific data into that region (or, even better, into a specific memory bank of that region)?

dawsfox commented 2 years ago

Or, now that I think about it further: if all the cores access the scratchpad region only through /dev/mem, I expect there would be no coherence issues, since all reads and writes through /dev/mem would bypass the cores' L1 caches altogether and there is no other way for that region to be accessed (or perhaps I'm misunderstanding). Regardless, I would still like to make use of the region's cacheability if possible.