khval / NallePuh

Paula audio and CIAA/CIAB emulation for AmigaOS
GNU General Public License v2.0
4 stars 0 forks source link

AHI crash (version 1.2) #25

Closed samo790 closed 11 months ago

samo790 commented 1 year ago

Just by testing some settings i was able to reproduce a crash, probably related to AHI ..

Crashlog_nallepuh_2023-10-02_19-42-26.txt

khval commented 1 year ago

Try to describe what you did before it crashed.

Starks trace is useless no debug symbols.

module DEVS:ahi.device at 0x6FB6C2F4 (section 0 @ 0xE2D0)
module DEVS:ahi.device at 0x6FB5F1A0 (section 0 @ 0x117C)
module nallepuh at 0x6F8F0B40 (section 0 @ 0x1B1C)
module nallepuh at 0x6F8F3FA8 (section 0 @ 0x4F84)
module nallepuh at 0x6F8F4198 (section 0 @ 0x5174)
module nallepuh at 0x6F8EF73C (section 0 @ 0x718)

r29 is NULL, so it’s a NULL pointer bug, but what causes it I don’t know.

*6fb6c2f4: 813d01b4 lwz r9,436(r29)

khval commented 1 year ago

Interrupt should technically happen in supervisor mode, without multitasking, so we are not doing that. (what’s this means is things can change while we are in a interrupt. Or while AHI doing something.)

I have been thinking about if should mutex protect interrupts, but when read about this thing, exception should possible inside exceptions, so perhaps not good.

in forbid state can be possible, but need keep track of number of forbids, in a nest. but nallepuh need multitasking to keep track of time. and I expect AHI needs multitasking as well.

there are some real challenges here. This can be a lot of cases that’s non fixable.

samo790 commented 1 year ago

Mmm seems there is an "easy way" to reproduce this issue Just extract the archive (without installing anything) then launch NallePUH and press "Activate" button ... then press the "Disactivate" button ---> Crash

Hypexed commented 1 year ago

Actually I found it can crash by just pressing Activate. Changing audio mode didn't help. I wonder if needs Update 2? That was my Update 1.5 install. On my Update 2 install it loads up and activates.

Need to put it through the ringer. Find where it's getting stuck. It's an ISI so not very useful.

khval commented 1 year ago

Yes, it's ciaa, ciab resources that not installed, I think, I was thinking to make it a option, and just check if it's open, at place it's used

On Sat, 7 Oct 2023, 08:22 Hypexed, @.***> wrote:

Actually I found it can crash by just pressing Activate. Changing audio mode didn't help. I wonder if needs Update 2? That was my Update 1.5 install. On my Update 2 install it loads up and activates.

Need to put it through the ringer. Find where it's getting stuck. It's an ISI so not very useful.

— Reply to this email directly, view it on GitHub https://github.com/khval/NallePuh/issues/25#issuecomment-1751620510, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIDFDJURJ5YSIUMRN52RTDX6DYLNAVCNFSM6AAAAAA5TZFRLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONJRGYZDANJRGA . You are receiving this because you commented.Message ID: @.***>

Hypexed commented 1 year ago

That shouldn't affect it. Since the crash was in NallePUH itself. The conflict is something in the system. Previous version didn't crash I had installed. Could be AHI related since activating would start AHI playing. But it passed all the open and version checks.

khval commented 1 year ago

No that doesn't make sense, it's when it starts playing sounds, there can be risk of dsi, if points point to bad sample data for example.

Isi error, is a little more special. Because then there has to be bad function pointer

On Mon, 9 Oct 2023, 08:09 Hypexed, @.***> wrote:

That shouldn't affect it. Since the crash was in NallePUH itself. The conflict is something in the system. Previous version didn't crash I had installed. Could be AHI related since activating would start AHI playing. But it passed all the open and version checks.

— Reply to this email directly, view it on GitHub https://github.com/khval/NallePuh/issues/25#issuecomment-1752394721, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIDFDJIIVBE2RVU5JMT3XLX6OILHAVCNFSM6AAAAAA5TZFRLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONJSGM4TINZSGE . You are receiving this because you commented.Message ID: @.***>

khval commented 1 year ago

Debug version was sent to samo790, he failed to verify the problem closing case.

Hypexed commented 1 year ago

Can you send to me also? Thanks. If the binary matches a debug version can be used with the section offset and will give the source line if possible.

So I've looked into how AHI does it. In the library mode AHI needs to be given the play command will put it into playback mode. After this it will accept audio functions playing samples, changing volume and pitch. Until it's given the stop command. Until AHI is told to play it wil emit silence and the audio functions have no effect.

AHI does work with an interrupt but only the hardware interrupt would need to be in supervisor. That's out of the scope of an AHI client because it has an audio buffer reserved for the hardware. However, for clients sending audio, that would also be in an interrupt but might only be a soft interrupt. In any case, on the off chance a NULL pointer is sent to AHI as a sample, it will attempt to fill the buffer with it and will crash in an interrupt. I've actually done this when working on The Maestrix. Somehow OctaMED was sending NULL audio pointers to the play routine. I suspect my RAM was bad as it didn't make sense but it did my head in trying to debug an interrupt handler when the system crash logger has no concept of what an interrupt is and was constantly confused.

khval commented 1 year ago

Ok, I’m reopening the issue if you want to look into it, 1.2 is already a bit old, its better we work on the latest code, I’m trying to improve build process, for some reason, it seams it links in old objects. “make all” should build debug version and normal version.

samo790 commented 1 year ago

AHI crashed seems fixed here (tested on version 1.4)

khval commented 1 year ago

Probably should give a warning, when CIAA/CIAB timer tables are in use when pressing deactivate. And perhaps when your trying to quit NallePuh, while the resources are in use.

(it looks like Hippo Player does release all timers, it looks like it gets confused by CIAA TimerA/B been free, as under OS3.x its used by the OS)

khval commented 1 year ago

Found bug with TEST button, now only possible to press, after you have pressed Activate.

Hypexed commented 1 year ago

I first tried to cross compile it. But the includes have different case in file names between source code reference and headers. Plus it's missing files apart from that like libblitter includes which aren't included here. Will need to just locate all needed and compile on X1000. Too much stuffing around trying to quickly compile on laptop. :)

Hypexed commented 1 year ago

Yes warning when resources in use would be good so you don't pull the ground out from beneath your feet. You may have found the HippoPlayer issue. I've found it seems to work fine, at least with CIAgent, then crashes in the end. It may not be coded correctly then. If it wants a specific timer it should allocate it and fail without it. You can't allocate the next free timer in the resources, you must specify what you want. So code that wants to allocate next free timer needs a subroutine that tests for both A and B timers in both resources. This may explain Foundation as well, that allocated one timer but then freed another. Lots of Amiga code may be buggy and we are finding out now! it just looks hard to me to allocate a specific timer but then free another. Do people even test their code? :-?

khval commented 1 year ago

Yes, I know, you most likely have the wrong header files.

I will upload to github when I get home, what you're missing, there will be a different git repo for ciaa/ciab libs.

fre. 27. okt. 2023 kl. 06:06 skrev Hypexed @.***>:

I first tried to cross compile it. But the includes have different case in file names between source code reference and headers. Plus it's missing files apart from that like libblitter includes which aren't included here. Will need to just locate all needed and compile on X1000. Too much stuffing around trying to quickly compile on laptop. :)

— Reply to this email directly, view it on GitHub https://github.com/khval/NallePuh/issues/25#issuecomment-1782256931, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIDFDNKAO6YM2KWXFFPT4LYBMXKRAVCNFSM6AAAAAA5TZFRLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOBSGI2TMOJTGE . You are receiving this because you modified the open/close state.Message ID: @.***>

khval commented 1 year ago

Download AvailCIA from Aminet, it helps you see what resources are in use. Yes, looks like lot cut and paste coding, without any code review, most 68K code are coded on computers without an MMU, but a MMU won’t find, poorly written code, it only finds the worst code, most of code written, when people who wrote it was just kids, without any experience, yeh there seams be lot of small bugs in Hippo Player. I’m not concerned about this right now.

fre. 27. okt. 2023 kl. 06:13 skrev Hypexed @.***>:

Yes warning when resources in use would be good so you don't pull the ground out from beneath your feet. You may have found the HippoPlayer issue. I've found it seems to work fine, at least with CIAgent, then crashes in the end. It may not be coded correctly then. If it wants a specific timer it should allocate it and fail without it. You can't allocate the next free timer in the resources, you must specify what you want. So code that wants to allocate next free timer needs a subroutine that tests for both A and B timers in both resources. This may explain Foundation ass well, that allocated one timer but then freed another. Lots of Amiga code may be buggy and we are finding out now! it just looks hard to me to allocate a specific timer but then free another. Do people even test their code? :-?

— Reply to this email directly, view it on GitHub https://github.com/khval/NallePuh/issues/25#issuecomment-1782262540, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIDFDPAN6UBMRSAL6LNID3YBMYHLAVCNFSM6AAAAAA5TZFRLKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOBSGI3DENJUGA . You are receiving this because you modified the open/close state.Message ID: @.***>

khval commented 1 year ago

here are some of the files you need to build, nallepuh. header files you find in subdirs. https://github.com/khval/AmigaTimerResources

Hypexed commented 1 year ago

Thanks. Okay I may have found the issue. At least I have found an issue. I ran the latest OS4Depot build, 1.4, which still crashes. I used the section offset in the debug build to grab the line which is here in timer.c at 125: https://github.com/khval/NallePuh/blob/e98ff5399dd19491567a00f4cb13d86e1f70a0ce/timer.c#L125C5-L125C5

Trace using custom Line script calling addr2line :

line 0x8f1c "RAM Disk:NallePuh/nallepuh.debug" 
_open_timer
/Work/pro_emu/NallePuh/timer.c:125

line 0x5a48 "RAM Disk:NallePuh/nallepuh.debug" 
HandleGUI
/Work/pro_emu/NallePuh/gui.c:840

line 0x77c "RAM Disk:NallePuh/nallepuh.debug" 
main
/Work/pro_emu/NallePuh/Nalle.c:243

The line looks fine so it's likely after the timer is started. I noticed the timer isn't cleared. This should be handled by handel_timer() but it instead restarts the old timer without clearing it the off old one. I don't see it pull the old request off except when closing the timer but it's too late by then.

The timer.device is quirky and one of those Amiga devices where they chose not to always use the standard device API. I think being non standard just made it confusing. Suggesting alternate functions looking like a hack.

Any way looks like it needs a simple GetMsg() to fix it. Below they use WaitPort with GetMsg(). https://wiki.amigaos.net/wiki/Timer_Device

khval commented 1 year ago

@Hypexed We don’t need WaitPort, as we use Wait() to check for signal, but need probly should do a GetMsg(), but its confusing the example does not use ReplyMsg() on the timer message, should it always be used?

khval commented 1 year ago

Never mind, as GetMsg returns IoRequest, and its already know, perhaps why its not required to have reply message, the example seams to suggest double/Trippel buffering the IoRequest.

Hypexed commented 11 months ago

Yeah. The IO send on our side would act like PutMsg() so when it comes back to us the timer device would bounce it back with ReplyMsg(). Thus why we only need GetMsg(). I suppose it's like normal messaging. A GetIO() type function would have helped so it looked consistent.

khval commented 11 months ago

I think it’s time to close this case, nothing being added or talked about has anything to do with the initial bug report.