radareorg / ideas

Support for multiple address spaces (harvard architecture) #226

Open bennofs opened 6 years ago

bennofs commented 6 years ago

Some architectures, such as Atmel AVR, have separate address spaces (in the AVR case, there is a "flash" address space for program code and a "ram" address space for data). For example, this is valid AVR code:

sts 0x2000, ...   ; store some data at address 0x2000
call 0x2000       ; call function at address 0x2000

Here, the first 0x2000 refers to the address 0x2000 in RAM, while the second one refers to the address 0x2000 of the ROM. R2 currently cannot handle this, as it expects a single address space and does not differentiate between data/code addresses.
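To make the collision concrete, here is a minimal sketch in Python, not r2 code. The names `FLASH`, `RAM`, `sts` and `fetch` are made up for illustration, and the sizes and byte values are arbitrary. It only shows that the same numeric address lands in different memory depending on the kind of access.

```python
# Hypothetical model of a Harvard machine: two independent memories.
# Sizes and byte values are arbitrary, chosen only for the demonstration.
FLASH = bytearray(0x4000)  # program memory ("flash" space)
RAM = bytearray(0x4000)    # data memory ("ram" space)


def sts(addr, value):
    """Data store: always targets the RAM space."""
    RAM[addr] = value & 0xFF


def fetch(addr):
    """Instruction fetch (e.g. the target of a call): always targets flash."""
    return FLASH[addr]


FLASH[0x2000] = 0x08   # pretend some code byte lives at flash:0x2000
sts(0x2000, 0xAA)      # writes ram:0x2000 and leaves flash untouched

print(hex(RAM[0x2000]), hex(fetch(0x2000)))
```

A tool with one flat address space has a single slot for 0x2000, so the store would appear to overwrite the function that the call targets.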

radare commented 6 years ago

That's supposed to be done via the new rio API. ESIL just needs a way to express that. It's also necessary for gb emulation.

radare commented 6 years ago

But for static code analysis we will need to think more about this. Maybe we want to specify which io layer to work on and swap the io map so that reads come from the right place.

astuder commented 6 years ago

Is there a good place to learn about the new rio api?

One approach would be to add a second qualifier "addr_space" to all memory addresses used in flags, esil, xref, RAnal_Op, wx/p8 @ etc., or replace memory addresses with a structure that consists of address and address space.

Arch and cpu would define required address spaces with name and valid address range. Arch specific anal implementation populates RAnal_Op and generates ESIL based on op codes (AVR, 8051) or advanced code analysis (6502/6510 bank switching). Archs could also malloc and initialize the required memory, independent of the loaded (code) image.

Address spaces could have further qualifiers, such as a read-only flag or a mapping to a physical address space.

More address spaces could be added by the user through r2 commands.
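A rough sketch of what such a qualified address could look like, written in Python for brevity rather than as r2 code. Every name here (`AddressSpace`, `QualifiedAddress`, `SpaceRegistry`) is hypothetical, and the ranges are arbitrary. It covers only the ideas above: named spaces with a valid range, a read-only flag, memory allocated independently of the loaded image, and extra spaces added later by the user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressSpace:
    name: str
    start: int                # first valid offset
    end: int                  # last valid offset (inclusive)
    read_only: bool = False


@dataclass(frozen=True)
class QualifiedAddress:
    space: str                # name of the address space
    offset: int               # address inside that space


class SpaceRegistry:
    """Hypothetical registry; an arch plugin or the user would fill it."""

    def __init__(self):
        self._spaces = {}
        self._memory = {}

    def add(self, space):
        if space.name in self._spaces:
            raise ValueError("duplicate address space: " + space.name)
        self._spaces[space.name] = space
        # Backing memory is allocated independently of any loaded image.
        self._memory[space.name] = bytearray(space.end - space.start + 1)

    def _check(self, addr):
        space = self._spaces[addr.space]  # KeyError for an unknown space
        if not space.start <= addr.offset <= space.end:
            raise IndexError("offset outside of address space")
        return space

    def read(self, addr):
        space = self._check(addr)
        return self._memory[space.name][addr.offset - space.start]

    def write(self, addr, value):
        space = self._check(addr)
        if space.read_only:
            raise PermissionError(space.name + " is read-only")
        self._memory[space.name][addr.offset - space.start] = value & 0xFF


reg = SpaceRegistry()
reg.add(AddressSpace("flash", 0x0000, 0x3FFF, read_only=True))  # defined by arch/cpu
reg.add(AddressSpace("ram", 0x0000, 0x3FFF))                    # defined by arch/cpu
reg.add(AddressSpace("eeprom", 0x0000, 0x03FF))                 # added by the user
reg.write(QualifiedAddress("ram", 0x2000), 0xAA)
```

Flags, xrefs and ESIL operands would then carry a `QualifiedAddress` instead of a bare integer, which is the invasive part of this approach.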

I think this could also be helpful to emulate x86 segments as used in real-mode or protected-mode DOS programs, unless that's already a solved problem in r2 land.

radare commented 6 years ago

Here’s condret’s talk on r_io:

https://www.youtube.com/watch?v=fs3VpUECsWg

On 26 Jan 2018, at 01:42, Adrian Studer notifications@github.com wrote:

I think this could also be helpful to emulate x86 segments like used in real-mode DOS programs, unless that's already a solved problem in r2 land.

In real-mode x86 a segmented address is just the segment value shifted left plus the offset, which covers 1MB of RAM. Where this really makes sense is emulating segments in protected mode, for things like TLS. This is not yet possible in r2: ESIL doesn't emulate thread-local storage because segmented memory is not handled properly.
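For reference, the real-mode translation is plain arithmetic. A small Python sketch follows; the function name `real_mode_linear` is made up here, and the 20-bit wrap-around assumes original 8086 behaviour (A20 line disabled).

```python
def real_mode_linear(segment, offset):
    """Linear address for a real-mode segment:offset pair.

    linear = segment * 16 + offset, with both parts being 16-bit values.
    The final mask models the 20-bit wrap of the 8086 (assumption: A20 off).
    """
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


# Different segment:offset pairs can alias the same linear address.
print(hex(real_mode_linear(0x1234, 0x0010)))
print(hex(real_mode_linear(0x1235, 0x0000)))
```

Because every segment:offset pair collapses to one flat linear address, real mode needs no separate address spaces. Protected-mode segments take their base from a descriptor table instead, so they cannot be reduced to this formula.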

Yeah, having some discussion on this may help define the way forward.

—pancake

radare commented 6 years ago

cc @condret similar to gb bankswitch

ret2libc commented 4 years ago

This issue has been moved from radareorg/radare2 to radareorg/ideas as we are trying to clean our backlog and this issue has probably been created a long while ago. This is an effort to help contributors understand what are the actionable items they can work on, prioritize issues better and help users find active/duplicated issues more easily. If this is not an enhancement/improvement/general idea but a bug, feel free to ask for re-transfer to main repo. Thanks for your understanding and contribution with this issue.

trufae commented 4 years ago

@condret would you like to work on this with me? I have some ideas for implementing multiple address spaces depending on access type or mode. I know the basic primitives for prioritizing maps in siol are there, but we probably want to add support for this logic in io directly. Also, does any other RE tool support this? How do they handle it? This is also an issue for PIC and AVR, and it must be done if we want to support multiple address spaces.
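One way to picture "address spaces depending on access type" on top of prioritized maps is the Python sketch below. This is not the r_io API. `Map`, `resolve` and the `READ`/`WRITE`/`EXEC` constants are hypothetical names, and the map layout is invented for the AVR example from the top of the thread.

```python
from dataclasses import dataclass

# Access types as bit flags (hypothetical values).
READ, WRITE, EXEC = 1, 2, 4


@dataclass
class Map:
    name: str
    start: int
    size: int
    perms: int      # which access types this map serves
    priority: int   # higher wins when several maps match


def resolve(maps, addr, access):
    """Return the highest-priority map that covers addr and serves access."""
    candidates = [
        m for m in maps
        if m.start <= addr < m.start + m.size and (m.perms & access)
    ]
    return max(candidates, key=lambda m: m.priority, default=None)


maps = [
    Map("flash", 0x0000, 0x4000, EXEC, priority=0),         # code fetches only
    Map("ram", 0x0000, 0x4000, READ | WRITE, priority=1),   # data accesses
    Map("overlay", 0x2000, 0x10, READ, priority=2),         # e.g. a user-added patch
]

print(resolve(maps, 0x2000, EXEC).name)   # the call target is looked up in flash
print(resolve(maps, 0x2000, WRITE).name)  # the sts target is looked up in ram
```

The analysis or ESIL side would only have to say which kind of access it performs. Overlapping maps at the same numeric address then stop being a conflict.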