Liblor / advanced_operating_systems_2020

Advanced Operating System Course at ETHZ
MIT License

Milestone6 #124

Closed Liblor closed 4 years ago

Liblor commented 4 years ago

Note: debug_printf can't print the whole long_string (the output stops at around 1000 characters). Weirdly, this now breaks our RPC, which it didn't do previously...

abertschi commented 4 years ago

Known bugs:

eikendev commented 4 years ago

We wanted to implement the receive interface in a way that allows the caller to do all the memory allocations. Now, we apparently didn't explain to you how to use the receive function, so here's a small guide: When you want the function to do the allocations for you, pass a reference to NULL. Otherwise, pass a reference to the message struct.
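A minimal sketch of that calling convention, in case it helps. The names `demo_msg` and `demo_recv` are hypothetical and not the actual functions in our code, and "a reference to NULL" is read here as the address of a pointer variable that holds NULL. The `data`/`len` parameters only stand in for whatever the channel delivers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PAYLOAD_MAX 64

/* Hypothetical message type; the real struct may look different. */
struct demo_msg {
    size_t len;
    char payload[DEMO_PAYLOAD_MAX];
};

/*
 * Hypothetical receive function illustrating the convention:
 *   *msg == NULL  -> the function allocates the message, the caller frees it
 *   *msg != NULL  -> the function fills the struct provided by the caller
 * Returns 0 on success and -1 on failure.
 */
int demo_recv(struct demo_msg **msg, const char *data, size_t len)
{
    assert(data != NULL);
    if (msg == NULL || len > DEMO_PAYLOAD_MAX) {
        return -1;
    }
    if (*msg == NULL) {
        /* Caller asked us to allocate. */
        *msg = malloc(sizeof(**msg));
        if (*msg == NULL) {
            return -1;
        }
    }
    (*msg)->len = len;
    memcpy((*msg)->payload, data, len);
    return 0;
}
```

A caller that wants the function to allocate declares `struct demo_msg *m = NULL;` and passes `&m`. A caller that owns the storage points `m` at its own struct first and passes `&m` the same way.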

This is not a great interface, but it's simple and was easy to implement. Any ideas for improving on this?

leopoldsedev commented 4 years ago

Known bug:

eikendev commented 4 years ago

The same happens when running the test on core 0 and spawning on core 0.

Liblor commented 4 years ago

> Known bug:
>
> * [ ] Running `rpc-test` on core 1 and configuring it to spawn on core 1 gets stuck after a few spawns.

Any progress on this from your side? We are also investigating.

eikendev commented 4 years ago

It appears to be a problem in paging/morecore. We think this is a synchronization issue between threads. @leopoldsedev created a smaller example where it also fails.

We should probably have a plan B in case this isn't resolved by this evening. One possibility is to have a "plan B" branch where we don't use threads in the monitor.

Liblor commented 4 years ago

We believe the issue is caused by morecore and the way we switch between the static and dynamic morecore modes. With the functionality of free removed, one can spawn ~90 dispatchers before some assertion fails. The thread seems to be stuck inside malloc (in an infinite loop).

Liblor commented 4 years ago

> We should probably have a plan B in case this isn't resolved by this evening. One possibility is to have a "plan B" branch where we don't use threads in the monitor.

Hm, that is certainly not a bad idea, but it would probably require each message to carry some identifier. Or is there another way to know where to send the response back to if the monitor is non-blocking?
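To make the identifier idea concrete, here is a rough sketch of how a non-blocking monitor could remember where each reply belongs. Everything in it is an assumption for illustration: `pending_table`, `pending_add`, and `pending_take` are hypothetical names, and the `int reply_channel` stands in for a real channel handle. The id returned by `pending_add` would travel inside the forwarded request and be echoed in the response.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* One forwarded request that is still waiting for its response. */
struct pending {
    int in_use;
    uint32_t id;        /* identifier carried by request and response */
    int reply_channel;  /* stand-in for a real channel handle */
};

struct pending_table {
    struct pending slots[MAX_PENDING];
    uint32_t next_id;
};

/* Record a forwarded request. Returns its id, or 0 if the table is full. */
uint32_t pending_add(struct pending_table *t, int reply_channel)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (!t->slots[i].in_use) {
            uint32_t id = ++t->next_id;
            if (id == 0) {
                id = ++t->next_id; /* 0 is reserved to signal failure */
            }
            t->slots[i].in_use = 1;
            t->slots[i].id = id;
            t->slots[i].reply_channel = reply_channel;
            return id;
        }
    }
    return 0;
}

/* Find the channel for a response and release the slot. Returns -1 if unknown. */
int pending_take(struct pending_table *t, uint32_t id)
{
    assert(t != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (t->slots[i].in_use && t->slots[i].id == id) {
            t->slots[i].in_use = 0;
            return t->slots[i].reply_channel;
        }
    }
    return -1;
}
```

With a table like this the monitor never has to block on a single request: it calls `pending_add` when it forwards a message and `pending_take` when any response arrives, in whatever order they come back.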

Liblor commented 4 years ago

I pushed a workaround on milestone6-workaround. With it, free may only be called on memory that was allocated in the same mode. E.g. if you allocate memory in the static morecore mode, you must free it in the static morecore mode.

On this branch I could spawn ~150 dummy dispatchers on one core before running out of memory.
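For illustration, the rule could be checked at runtime roughly like this. This is only a sketch and not the code on the branch: `tagged_alloc`, `tagged_free`, and `set_mode` are hypothetical wrappers around the standard allocator, and each block gets a small header that records the mode it was allocated in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The two allocation modes; the names are assumptions for this sketch. */
enum mc_mode { MC_STATIC, MC_DYNAMIC };

static enum mc_mode current_mode = MC_STATIC;

/* Header placed in front of every block; the union keeps the payload aligned. */
union tag_header {
    enum mc_mode mode;
    max_align_t align;
};

void set_mode(enum mc_mode mode)
{
    assert(mode == MC_STATIC || mode == MC_DYNAMIC);
    current_mode = mode;
}

/* Allocate a block and remember the mode that was active at the time. */
void *tagged_alloc(size_t size)
{
    union tag_header *h = malloc(sizeof(*h) + size);
    if (h == NULL) {
        return NULL;
    }
    h->mode = current_mode;
    return h + 1; /* payload starts right after the header */
}

/*
 * Free a block only if it was allocated in the current mode.
 * Returns 0 on success, -1 on a mode mismatch (the block stays allocated).
 */
int tagged_free(void *ptr)
{
    if (ptr == NULL) {
        return 0;
    }
    union tag_header *h = (union tag_header *)ptr - 1;
    if (h->mode != current_mode) {
        return -1;
    }
    free(h);
    return 0;
}
```

A check like this costs one header per allocation but turns a violation of the rule into an immediate error instead of a hang somewhere later in malloc.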

leopoldsedev commented 4 years ago

> I pushed a workaround on milestone6-workaround. With it, free may only be called on memory that was allocated in the same mode. E.g. if you allocate memory in the static morecore mode, you must free it in the static morecore mode.

Seems to be going in the right direction. In my test, multiple threads call malloc() and then free(). The test gets stuck exactly after one of the threads has called free() and another thread calls malloc() afterwards. If I remove the free() calls, it runs through.
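The shape of such a test is roughly the following. This is a stand-in sketch, not the actual test: it uses POSIX threads instead of our own thread API, and `run_stress`, `worker`, and the sizes are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAX_THREADS 16

struct worker_arg {
    int iterations; /* how many malloc/free pairs to run */
    int completed;  /* how many pairs actually finished */
};

/* Each thread repeatedly allocates, touches, and frees a block. */
static void *worker(void *p)
{
    struct worker_arg *arg = p;
    for (int i = 0; i < arg->iterations; i++) {
        size_t size = 16 + (size_t)(i % 64) * 8;
        char *buf = malloc(size);
        if (buf == NULL) {
            break;
        }
        memset(buf, 0xAB, size); /* make sure the memory is really usable */
        free(buf);
        arg->completed++;
    }
    return NULL;
}

/* Start nthreads workers and return the total number of completed pairs. */
int run_stress(int nthreads, int iterations)
{
    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    pthread_t threads[MAX_THREADS];
    struct worker_arg args[MAX_THREADS];
    int started = 0;

    for (int i = 0; i < nthreads; i++) {
        args[i].iterations = iterations;
        args[i].completed = 0;
        if (pthread_create(&threads[i], NULL, worker, &args[i]) != 0) {
            break;
        }
        started++;
    }

    int total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        total += args[i].completed;
    }
    return total;
}
```

With a correctly synchronized allocator, `run_stress(4, 1000)` returns once all threads are done. With the bug described above, the equivalent test would hang in one of the malloc() calls instead.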

Liblor commented 4 years ago

Okay, it should now be fixed on the workaround branch!

@abertschi said he just encountered a problem on the branch though...