image-et-son / p600fw

GliGli based Prophet 600 firmware upgrade

Still under investigation: Switching from poly mode to unison with the seq running leads to CV glitches. On a still cold p600 #67

Closed el-folie closed 2 years ago

el-folie commented 2 years ago

Okay, sorry, not a bug per se; it seems to be completely related to cold hardware not being ready for fast CV action for about 20 minutes. After that there is no glitchy behaviour anymore. I will monitor this further, though. And this doesn't occur on the Z80, even with a still-cold P600. So it must be a timing-related problem connected to the much faster update rate and higher precision of the Teensy. On the Z80 this problem probably doesn't occur because of the slower refresh rate and lower resolution, which are much more forgiving of variances.

Weird behaviour: I tested this through all imogen alphas back to gligli 2.1rc3 (no seq on 2.0), so it's an old bug or glitch. There is no such glitchiness with the Z80 in place, so it must be a gligli software issue. The glitches don't always occur at slow seq speeds, but they always occur at high seq speeds.

  1. It sounds as if the multitude of simultaneous "note ons" are freaking out the DAC timing-wise. My first thought was, that on the Sequential OS they maybe managed to prevent this by tiny wait-states between the simultaneous "note-ons".
  2. BUT - it also sounds as if the "envelope hold" aspect in unison mode might be the problem, because sometimes the DAC spits out three same pitched notes in a row even though the sequence is high-low-high-low with C0 and C5. So it seems as if the DAC for a split second can´t decide if the pitch CV is meant for the high or the low note and makes a wrong decision. So the question could be, which parts before the DAC may lead to a wrong decision in timing of the CVs. I suspect timing is the critical thing with this bug or glitch.
  3. I now think the glitch is solely unison mode/hold-related because it also happens in one-note unison.
  4. I verified the problem by turning off all voices but one. I went through each voice individually; the glitch is happening in every single voice.
  5. Now the next weird thing: the issue seems to be related to the p600 still being cold right after 1st powering up. After warmed up for 20 min it seems to become more stable and the sequence running having less and less glitches. That would lead me more to the direction of hardware/chips not being at their operating point soon after powering up, leading to wrong timing/CV.
  6. What´s funny: in 6-note-unison the max seq speed is slower than in poly mode or single note unison. Weird machine...

I´ll need to investigate this further to know if it´s only my machine again.

I will do audio files to exemplify.

Just to be clearer, the glitches also happen in unison mode and then switching on a running sequence. Switching between poly and unison only instantly shows the difference between "no glitches - switching to unison - glitches". So the glitches are not connected to the switching itself but the unison mode.

(These random glitches made me crazy for years already, they were hidden/masked also by the former pitch wheel glitches, which have gone after imogen´s pitch wheel fix.)

matrix12x commented 2 years ago

I set up a patch in manual mode that matched your patch, except that I set "ext_voltage" to 00 (your patch was 48 in syx_converter), vintage = 00 (yours was 48 for this also) and unison detune = 00 (was set to 48 also, weird). I set unison on (6 voice) and ran arp up/down latched at a slowish speed, C1 to C6, for maybe 30 min in manual (panel) mode (in SCI and GLIGLI mode 15 min each), and mine never glitched. UGH!! In one way, that's good; in another, I'm annoyed that I can't replicate this. I clearly hear it and see it in your audio.

Also @image-et-son in manual mode current Alpha, re: the Zero Load fix, My pitch bender (PB) does not function in manual (panel) mode. PB works fine in preset mode. All other panel knobs function fine (pitch for both VCOs, fine, etc)

image-et-son commented 2 years ago

> Also @image-et-son in manual mode current Alpha, re: the Zero Load fix, My pitch bender (PB) does not function in manual (panel) mode. PB works fine in preset mode. All other panel knobs function fine (pitch for both VCOs, fine, etc)

This means the "zero load" problem is not solved :-( I only removed the most drastic effect.

As far as the glitches are concerned: I sometimes hear something where I think "What was that?" but then it's so rare I never followed it up. I cannot observe the consistent glitches el-folie describes. I do think it's perfectly possible that the main timer interrupt, which sets all the frequencies, can interrupt the assigner and therefore create a short lapse. I will look into that.
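To make the suspected lapse concrete, here is a minimal Python sketch. It is not the firmware code (which is C on the Teensy); `assign_unison` and `timer_interrupt` are hypothetical names, and the model simply assumes that the assigner writes six voice slots one after another while a timer may fire between two writes and latch whatever notes it finds.

```python
# Illustrative model only: names and structure are assumptions, not p600fw code.

def assign_unison(voices, new_note, interrupt_at=None, on_interrupt=None, guarded=False):
    """Write new_note into every voice slot, one slot at a time.

    interrupt_at: index of the slot before which the simulated timer fires.
    guarded: if True, the interrupt is deferred until all slots are written
             (modelling a critical section with interrupts disabled).
    """
    pending = False
    for i in range(len(voices)):
        if interrupt_at == i and on_interrupt is not None:
            if guarded:
                pending = True          # remember the interrupt, run it later
            else:
                on_interrupt(voices)    # fires in the middle of the update
        voices[i] = new_note
    if pending:
        on_interrupt(voices)            # deferred interrupt sees a consistent state
    return voices


def run(guarded):
    latched = []

    def timer_interrupt(voices):
        # The timer copies the current note of every voice to the CV outputs.
        latched.append(list(voices))

    voices = [24] * 6                   # all six voices on the low note
    assign_unison(voices, 84, interrupt_at=3,
                  on_interrupt=timer_interrupt, guarded=guarded)
    return latched[0]


unguarded_snapshot = run(guarded=False)  # mix of old and new notes
guarded_snapshot = run(guarded=True)     # all voices on the new note
print(unguarded_snapshot)
print(guarded_snapshot)
```

In the unguarded run the timer latches three voices on the new note and three still on the old one, which is the kind of brief wrong pitch described in this issue. Whether the real assigner can actually be interrupted this way is exactly what remains to be checked.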

matrix12x commented 2 years ago

I would also like to clarify the bender bug, to be specific, Bender target seems to reset to zero in Manual mode. if I change it back to "ab" it works. then if I switch on and off power it goes back to off.

el-folie commented 2 years ago

> I set up a patch in manual mode that matched your patch, except that I set "ext_voltage" to 00 (your patch was 48 in syx_converter), vintage = 00 (yours was 48 for this also) and unison detune = 00 (was set to 48 also, weird). I set unison on (6 voice) and ran arp up/down latched at a slowish speed, C1 to C6, for maybe 30 min in manual (panel) mode (in SCI and GLIGLI mode 15 min each), and mine never glitched. UGH!! In one way, that's good; in another, I'm annoyed that I can't replicate this. I clearly hear it and see it in your audio.

> Also @image-et-son in manual mode current Alpha, re: the Zero Load fix, My pitch bender (PB) does not function in manual (panel) mode. PB works fine in preset mode. All other panel knobs function fine (pitch for both VCOs, fine, etc)

The patch data must be corrupted somehow then, vintage/detune/ext volt. all are zero in my patch on the P600. And exactly that patch was what I sent to MIDIOX and uploaded here. Weird!

What I forgot to say, best to start with an ice cold P600 when hoping for the bug to appear ;-) Also arp speed to max and then going wild up/down with the speed to provoke cpu hiccups, when they appear one time, they then stay on my P600, even at slow speeds as demonstrated in the audio.

But I also found this: after letting the P600 sit and warm up for several hours, I can´t provoke the glitch anymore. So IMHO I guess it´s part a hardware fault and part a special software difference between SCI and GliGli that somehow interacts with my hardware. As said before, on SCI OS it doesn´t matter if the machine is still cold, no glitch, never. Ghost in the machine! Also a great Police album ;-)

el-folie commented 2 years ago

> I would also like to clarify the bender bug, to be specific, Bender target seems to reset to zero in Manual mode. if I change it back to "ab" it works. then if I switch on and off power it goes back to off.

This is weird too - I cannot confirm the bender bug. On my P600 the bender target with the latest RC14_c2 stays intact after power cycle in manual mode. Never had this problem once with the new alpha.

el-folie commented 2 years ago

> > Also @image-et-son in manual mode current Alpha, re: the Zero Load fix, My pitch bender (PB) does not function in manual (panel) mode. PB works fine in preset mode. All other panel knobs function fine (pitch for both VCOs, fine, etc)

> This means the "zero load" problem is not solved :-( I only removed the most drastic effect.

> As far as the glitches are concerned: I sometimes hear something where I think "What was that?" but then it's so rare I never followed it up. I cannot observe the consistent glitches el-folie describes. I do think it's perfectly possible that the main timer interrupt, which sets all the frequencies, can interrupt the assigner and therefore create a short lapse. I will look into that.

Which hardware part of the P600 would/could influence the main timer interrupt in that way? Maybe I could work my way along the possible hardware parts to find the culprit. I just don´t have any clue where to start with this bug...

matrix12x commented 2 years ago

U320 on Board 3 is the interrupt timer, and it goes thru hex buffer U318:

[image]

el-folie commented 2 years ago

Great, thanks!

U318 was def renewed when I socketed the main board (at that time already because of the glitches, so U318 was/is new and so ok). I do have a spare U320 and U306. Would it make sense that those could age or be temperature dependent in their function? Or maybe, could the whole timing relevant circuit starting from the crystal be affected by age/temperature? Or would the whole CPU and machine glitch out completely if that was the case? So that only a portion of the timing relevant circuits may be the culprit?

Or maybe this: I didn't renew the tantalum capacitors on the CPU board - bad idea? Should I replace them all?

el-folie commented 2 years ago

Okay, this is totally crazy: I just exchanged U320 for the new one, so for a few minutes the P600 was powered off. Powered on and there are the glitches again... So, it's not U320. Maybe it's really power related: fluctuations of the regulators or spec inconsistencies of parts around the crystal. Maybe I should renew the whole power generation part on the main board and the tantalum capacitors. But I think I'll start with the 7493 divider chip tomorrow...

matrix12x commented 2 years ago

Crazy idea, what about heat sinking the teensy?

el-folie commented 2 years ago

As your and imogen´s machine don´t show the glitches and I assume you both didn´t "hotrod" your Teensy boards with heat sinks... I don´t know ;-)

Maybe I should also reflow the cpu socket. But the cause could be anything on the main board... it´s been like that from the start with earliest GliGLi releases.

matrix12x commented 2 years ago

I was actually thinking about measuring the temperature of the CPU on the teensy ++ and then dropping a small heatsink on just the teensy's CPU with some thermal compound, to see if I could squeeze some additional responsiveness out of it. mostly as an experiment.

What about using cold spray to isolate what component is causing the glitch? We used to do this when I was designing RF communications equipment. Spray (small amounts) using the little red tube that usually comes with the cold spray and target parts that have to do with either timing or pitch. It will immediately tell you what parts are at issue.

The reason I say cold spray is you mentioned that your P600 does this until warmed up. On a related note, why not wait until the unit is warmed up to use it? I mean for tuning stability you generally have to anyway.

el-folie commented 2 years ago

Ahh, I see, interesting that heat sink experiment would be, yes.

Thank you for explaining how to troubleshoot with cold spray. So far I only ever went by schematics/functions and thinking. A problem is that in my town there is no shop like "radio shack", so I need to order everything online. Maybe I could use a plastic cold pack from the freezer compartment? (At least I could try to touch IC chips with it, but probably not smaller parts like tantalum capacitors.)

The thing about waiting is that it´d take 2 hours or so before the glitches go. So a bit long to wait compared to the normal/usual 20-30 minutes for analog synths.

New idea: Very interesting would be the question as to why only live mode exhibits the glitches and almost never the preset mode. There must be a significant difference in how these two modes use timing relevant circuits - if we could find out when/how the GliGli OS does and does not select certain circuit parts in live and preset mode - that would lead to the exact cause and circuit part. Also interesting would be the question how does the SCI OS handle live and preset mode. Maybe the live mode is just an "always on" preset. Theoretically there shouldn´t be a reason why live mode behaves differently timing-wise than preset mode. But obviously I can´t know, maybe the teensy board just requires a different approach for both modes. Maybe Mr. GliGli himself would know what´s happening as he designed the OS/modes.

image-et-son commented 2 years ago

> New idea: Very interesting would be the question as to why only live mode exhibits the glitches and almost never the preset mode.

Yes, I think there I could continue to analyze theoretically. It must be an implicit dependency. The modes are actually the same. The only difference is where the internal variables get the values from. During operation there is no difference. But there could be some hidden unwanted dependency.

> Maybe Mr. GliGli himself would know what's happening as he designed the OS/modes.

I have been in contact with Fabrice on our whole development activity (he agreed with it and also wanted to participate in testing and would post it on his page when it's done!). Ultimately I could ask him. But I also need to solve the zero load topic because that is a real bug which I introduced :-(

el-folie commented 2 years ago

Hi, okay, so the difference is if the (saved) values are being read from a memory location or read and applied live from the controls in live mode, which also is kind of a preset in itself. In my simple understanding I would assume that the only difference may be that in live mode some part of the code always expects/listens to changes more closely than in preset mode, where certain distinct patch parameters stay the same. And maybe, until completely warmed up, some timing relevant circuit doesn´t deliver/react fast enough to supply the required info or execute a function in time. But, like you assumed, if there maybe is an unwanted dependency that´d lead to another cause of course.

Super cool that Fabrice is on board! What a hero for dissecting the old CPU/SCI OS and resurrecting the old P600 for the future. Also, totally curious about his development/involvement on the Pro800 and maybe also Pro16.

I never had the zero load issue again with RC14_c2 - but I assume you are searching for a more elegant bug fix than the brute force reload of live memory.

matrix12x commented 2 years ago

"Maybe I could use a plastic cold pack from the freezer compartment? (At least I could try to touch IC chips with it, but probably not smaller parts like tantalum capacitors.)"

Use the icepack to cool the end of a Q-tip and touch parts with the cold end of the Q-tip. We used to do that too. Warning: don't get parts wet.

el-folie commented 2 years ago

> "Maybe I could use a plastic cold pack from the freezer compartment? (At least I could try to touch IC chips with it, but probably not smaller parts like tantalum capacitors.)"

> Use the icepack to cool the end of a Q-tip and touch parts with the cold end of the Q-tip. We used to do that too. Warning: don't get parts wet.

Thanks for the tip! I tried it with the plastic edge of the ice pack yesterday but it didn´t influence the glitches, when held on ICs. I´ll have another try in the next days, am just very occupied atm.

image-et-son commented 2 years ago

A small update where I stand on the glitch topic. I have analyzed the process from pushing a key via voice assignments to application of voltages:

image-et-son commented 2 years ago

Another observation: I had some glitches recently, in particular, I had made some suboptimal decision to place the VCA modulation in the non-timed loop. I did this because the voltages of Vol A and Vol B used to be "slow" updates in the non-timed loop with much longer wait times in the DAC voltage setting process. But when I made this fast and moved it to the fast timed loop I got glitches, and btw. strongly temperature dependent too, e.g. mostly gone after warm up! So I changed a few parameters to see how far I can push this and this is my conclusion of this experiment:

@el-folie : I hope that the added wait times will solve the problem for you, too. Will send a trial version asap.

image-et-son commented 2 years ago

Hi, please try this version: I put in safer wait times for the multiplexer and DAC, and I compensated for this with other performance measures. Still open: 1) the "zero load" problem (I don't have it - @matrix12x !?) and 2) a possibly too long lag before the knobs first react, until they become "excited" and really responsive.

Does that version do something for the glitches?

p600firmware_alpha14_c_20220216.zip

el-folie commented 2 years ago

Imogen, great analyzing and findings! It´s so satisfying to be a part of this endeavor and finally seeing that some of my intuitions were right regarding warm-up/timing/glitches. Thank you for your perseverance!!! I wish I knew how to mod my P600 hardware-wise to be as responsive as yours and Matrix12x´s.

I just tried the test OS above. Test setup: max speed C1-C6 seq in unison, live and preset mode:

So, it´s safe to assume that the former warm-up dependent glitches really are exactly what you described above - the outcome of the/some P600 hardware not being able to follow the refresh rate at max possible teensy speed.

I hope this gives some pointers on how to tackle this problem best. I also feel like it would be best to bring my P600 hardware up to spec like yours if I only knew what to change...

At least we know now it´s not a bug per se but a hardware limitation not all P600 will have. Quite some discoveries! And I could have been searching the hardware "fault" forever, probably without any conclusion as there is no fault, just hardware not being up to speed. ;-)

el-folie commented 2 years ago

Glitch test recording with the test OS: glitch test OS recording.zip

At 00:03, when switching from live to preset, you hear the first glitch (long note) on the button press.
At 00:07, button press back to live mode = glitch (long note).
At 00:11, slowly raising cutoff = glitches (double/triple triggering of the same note).
At 00:18, switching to preset and the same procedure as above, with the same glitches.
After that, some random knob tweaking: MIX, volume, speed, cutoff. One can hear that any knob tweaking reintroduces the glitches.

Hope it helps to find the right balance for multiplexer wait states/voltage raise timing for the P600 hardware limits.

Anyhow - exciting and excellent work! 👍

matrix12x commented 2 years ago

GliGli and I had a conversation in the early days of the P600 thread on gearspace that maybe the on/off times of some CD4051 chips are high (spec 350nS), and may get higher with age.

I postulated at the time that "I wonder if swapping them out with a DG9051 would yield even faster envelope times. or even the CD74HCT4051, which have much faster on and off times. from like 350nS to like 30-50 nS."

Gligli said "Could these aging CD4051s with the higher Ron be the issue with the tuning that some people have, but not others? I had previously assumed that Ron would not change over time. Bad assumption."

I said "Specifically U415 as an issue? As it is the one for the oscillators. The CD4051 datasheet has a Ron of approx 450 to 1000 ohms @ a VCC of +5VDC. That's with the 0.01uF cap, that time constant can be over 23 mS @1000 ohms (assuming 5VDC out)... I wonder if anyone with the tuning issue were to change out U415 (CD4051) with a new CD4051 if their issues would disappear? For fun, I may swap mine out with the DG9051, which has an Ron of 35-60 ohms."
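As a rough check of the RC figures in the quote above, the sketch below computes the time constant and the time to reach 90% of a voltage step for the quoted on-resistances and the 0.01 uF capacitor. It assumes a plain first-order RC model, which is a simplification of the real sample-and-hold stage, and the function names are made up for this example.

```python
import math

C_HOLD = 0.01e-6  # 0.01 uF hold capacitor, as quoted above


def time_constant(r_on_ohms, c_farads=C_HOLD):
    """RC time constant in seconds for a switch resistance charging a capacitor."""
    return r_on_ohms * c_farads


def settle_time(r_on_ohms, fraction=0.9, c_farads=C_HOLD):
    """Time in seconds for an RC step response to reach `fraction` of its final value."""
    return -time_constant(r_on_ohms, c_farads) * math.log(1.0 - fraction)


for name, r_on in (("CD4051, 450 ohm", 450), ("CD4051, 1000 ohm", 1000), ("DG9051, 60 ohm", 60)):
    tau_us = time_constant(r_on) * 1e6
    t90_us = settle_time(r_on) * 1e6
    print(f"{name}: tau = {tau_us:.1f} us, 90% settle = {t90_us:.1f} us")
```

With 1000 ohms and 0.01 uF the product is 10 microseconds, and reaching 90% of a step takes about 23 microseconds, so the "23 mS" in the quote may have been meant as microseconds. Either way, the settle time scales directly with Ron, which is the point of the suggested chip swap.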

All of that being said, I suggest swapping out the CD4051 with one of the above models. the main difference between the chips is that the CD chip can take a higher supply voltage, but this chip in the P600 only gets 5V, which works well with the 74HC4051.

BTW I had swapped my U415 and U416 CD4051 with a 74HCT4051 quite some time back in hopes of fixing the Tune issue I had. My unit seems to hate it when the Tune button is pressed.

matrix12x commented 2 years ago

@image-et-son Should I try and refresh using .hex to see if my zero load of the pitch bender on power reset goes away?

image-et-son commented 2 years ago

I think it might be useful to consult Fabrice on this at some stage - I am sure he must have done a lot of experimenting around this to get the "stable release". Just for information: there are two speeds for setting voltages, a "fast path" and a normal one. The two differ in the wait cycles between the hardware operations as follows:

Fast Path:

  1. SET DAC VOLTAGE
  2. WAIT(1)
  3. SELECT CV
  4. WAIT(1)
  5. DESELECT
  6. (NO WAIT)

Normal:

  1. SET DAC VOLTAGE
  2. WAIT(4)
  3. SELECT CV
  4. WAIT(8)
  5. DESELECT
  6. WAIT(8)

As you can see there is a huge difference! The wait times are a significant part of the performance limitation, or - ultimately - the limiting factor. This is also true with control readout functions where similar wait cycles are implemented. In the warm up glitches I changed the update of volume A and volume B from normal to fast. This obviously was too fast for the unit. The fast path is exclusively used to set the oscillator pitch, cut-off frequency and amplitude in the 2kHz cycle. I think the basic quality of the firmware upgrade relies on this update frequency and it was necessary to create the fast path to manage that update 2000 times per second.
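The difference between the two paths can be put into numbers with a small sketch. Two values are assumptions rather than facts from the code: one WAIT(1) unit is taken as 250 ns (an estimate given later in this thread), and the number of fast-path updates per cycle is a hypothetical 24 (6 voices, each with two oscillator pitches, cutoff and amplitude). Only the pure wait time is counted, not the surrounding code.

```python
# Wait units per voltage update, read off the two lists above:
# (after SET DAC VOLTAGE, after SELECT CV, after DESELECT)
FAST_PATH = (1, 1, 0)
NORMAL_PATH = (4, 8, 8)

UPDATE_RATE_HZ = 2000   # the 2 kHz cycle
WAIT_UNIT_NS = 250      # assumption: length of one WAIT(1) unit
UPDATES_PER_CYCLE = 24  # assumption: 6 voices x (2 pitches + cutoff + amplitude)


def wait_ns(path, unit_ns=WAIT_UNIT_NS):
    """Total pure wait time for one voltage update, in nanoseconds."""
    return sum(path) * unit_ns


def cycle_share(path, updates=UPDATES_PER_CYCLE, rate_hz=UPDATE_RATE_HZ):
    """Fraction of one update period spent only in wait cycles."""
    period_ns = 1e9 / rate_hz
    return updates * wait_ns(path) / period_ns


for name, path in (("fast", FAST_PATH), ("normal", NORMAL_PATH)):
    print(f"{name}: {wait_ns(path)} ns per update, "
          f"{cycle_share(path):.1%} of a 2 kHz period")
```

Under these assumptions the fast path waits 500 ns per update (about 2.4% of the period for 24 updates), while the normal path would wait 5000 ns per update (about 24%), which gives a feel for why the fast path was needed.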

@el-folie: Increasing the wait time in the fast path may or may not fit into the 2kHz cycle. It certainly eats directly into the user interaction performance. Still, I'd propose to send you a version with an extended wait cycle for fast path just to see if that removes the glitches in your case. What do you think?

image-et-son commented 2 years ago

> @image-et-son Should I try and refresh using .hex to see if my zero load of the pitch bender on power reset goes away?

Well, I no longer have that zero load problem, but even after the change I made (always restoring equal tempered tuning) you said you had situations in which the bender went dead which has the same root cause. Loading zero values instead of what was stored and even overwriting defaults (as in the case of the bender where the default is 5 (semitones)) must never occur, so I need to find the root cause. You're the one who is observing this. Always? Always in the same situation?

el-folie commented 2 years ago

> @el-folie: Increasing the wait time in the fast path may or may not fit into the 2kHz cycle. It certainly eats directly into the user interaction performance. Still, I'd propose to send you a version with an extended wait cycle for fast path just to see if that removes the glitches in your case. What do you think?

First, thank you imogen and Matrix12x for explaining all the technical details, I´m learning all the time by your knowledge!

I think there are two perspectives from which to tackle the timing issues:

  1. The usual synth user view: "I just want to use this damn thing as it´s supposed to, but with the new GliGLi functions - so why does it glitch when it always had worked perfectly before with the old z80 and SCI OS..."

For this type of user (which probably will be a majority) a "safe" and stable OS version would be best, but with the tradeoff of a slower overall response, which then again would be contradictory to the goal of using a new CPU and a fresh, richly featured OS for a lightning fast & responsive user experience. Therefore I personally would prefer another view:

  2. The technically-inclined synth nerd view: "I want to experience the new CPU and OS in the best possible way, with the fastest responsiveness possible, for the nicest and most satisfying user experience that the new CPU refresh rates would allow. And therefore I'm willing to change a few chips to allow that".

For this type of user (which will be a minority, but we can assume that only technically-inclined would even look at alternative CPUs and OSs for a vintage synth) we can assume that he´d do anything to achieve the best performance of his instrument and use the most brutal fast OS he could find for a lightning fast & precise on spot synth action and experience.

So, I'd personally go with option 2, and with the new information about the "R on" timing of ICs I will change every single part in my P600 for the fastest part I can find. I want my P600 to be as performant as yours and Matrix12x's. And I'd want you to develop the fastest OS possible for the P600, not being hindered by old ICs. Precision and speed should be the ultimate goal, not mass compatibility, when most users wouldn't even look at a new CPU and GliGLi OS. This is nerdy stuff and nerds WILL update their machine with new ICs to get the best performance.

@matrix12x I understand the CD4051 can be one culprit. I had a look at the data sheet again and it seems they are really, really bad. I couldn't find the term "R on", but I assume in this case it's really 350ns minimum (at 5V) and up to 700ns max. So it's obvious that these ICs do not allow for fast performance. See attached an excerpt of the IC data. Another question would be which other ICs are timing relevant, or whether all logic ICs could simply be exchanged for the fastest replacement types available. If that's a yes then I will definitely do that. MC14051BCP_data

el-folie commented 2 years ago

@matrix12x Do all "CD74HCT" types/denominations have the fastest "R on" timings?

image-et-son commented 2 years ago

if I understand the AVR timer logic correctly, in our case a WAIT(1) corresponds to 250ns (and then 4, 8 as multiples accordingly).

As a test I would try several variants:

I am not proposing to use that setting, but it could help to probe the limit. For the broader audience I could even think about offering two variants. But I am wondering: shouldn't there be users of version 2.0 out there complaining about glitches?

image-et-son commented 2 years ago

Just a little information from digging: if you take the three wait times in the fast path , we have from 2.1 RC3: (1|1|0). Look at the history:

The Stable Release 2.0 was posted by GliGli on May 25th 2014, so it's hard to tell if it uses (1|1|0) or (1|1|1) or (1|2|2).

In any case, it shows that the "stability" revolves around choosing the optimal setting here and that GliGli spent time experimenting with this. @el-folie: do you have the glitches in version 2.0?

el-folie commented 2 years ago

> if I understand the AVR timer logic correctly, in our case a WAIT(1) corresponds to 250ns (and then 4, 8 as multiples accordingly).

Okay, that´d mean that 350ns as the minimum like with my CD4051s is too slow anyway, right? In that case it´s no wonder I´m getting glitches.

el-folie commented 2 years ago

> In any case, it shows that the "stability" revolves around choosing the optimal setting here and that GliGli spent time experimenting with this. @el-folie: do you have the glitches in version 2.0?

I´ll check it and report back...

el-folie commented 2 years ago

Definitely I´m getting glitches with OS stable 2.0!

Not as drastic as with the new alphas, but definitely also pitch glitches/double triggering of pitches like C6-C6, where it should be C6-C1-C6 and so on (tried with the ASSGN arp as there´s no seq on stable 2.0). And this was my experience from the beginning of using GliGLi OSs. I always wondered if my P600 has hardware faults - now I know it´s (hopefully) just the timing relevant ICs, they are too slow for a modern CPU and OS.

Oh, and of note, I got a stuck spinning rectangle on first try of reflashing by syx to stable 2.0, with the last "test RC14" still residing in teensy memory. On the second try I got an "E". On the third try I got nothing anymore. So for the fourth try I reflashed by hex and all went fine.

image-et-son commented 2 years ago

> > if I understand the AVR timer logic correctly, in our case a WAIT(1) corresponds to 250ns (and then 4, 8 as multiples accordingly).

> Okay, that'd mean that 350ns as the minimum like with my CD4051s is too slow anyway, right? In that case it's no wonder I'm getting glitches.

Yes, but you also have the code surrounding it which also takes time. Hard to tell. It seems at the edge and the experiments around it support that. BTW: I looked up some discussions on the topic in the forum but it is hard to reconcile what people are reporting, see https://gearspace.com/board/showpost.php?p=10414068&postcount=677
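The "at the edge" argument can be sketched as a simple margin calculation. The switching times are the ones quoted in this thread (350 ns and up to 700 ns for the CD4051, about 50 ns for a 74HCT4051) and have not been re-checked against datasheets; the 200 ns of surrounding code time is a purely hypothetical figure, and `settle_margin_ns` is a name invented for this example.

```python
WAIT_UNIT_NS = 250  # one WAIT(1) unit, as estimated in this thread


def settle_margin_ns(wait_units, switch_time_ns, overhead_ns=0):
    """Slack left once the multiplexer has switched; negative means too short."""
    return wait_units * WAIT_UNIT_NS + overhead_ns - switch_time_ns


# Switching times as quoted in this thread (not re-checked against datasheets).
PARTS = (("CD4051 at 350 ns", 350), ("CD4051 at 700 ns", 700), ("74HCT4051 at 50 ns", 50))

for overhead_ns in (0, 200):  # 200 ns of surrounding code is a hypothetical figure
    for name, switch_ns in PARTS:
        margin = settle_margin_ns(1, switch_ns, overhead_ns)
        verdict = "ok" if margin >= 0 else "too short"
        print(f"overhead {overhead_ns} ns, {name}: margin {margin} ns ({verdict})")
```

With WAIT(1) alone a 350 ns part is 100 ns short, while a modest amount of surrounding code time would tip it back to a small positive margin. A part that small a margin away from failing could plausibly behave differently cold and warm, though this model says nothing about temperature itself.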

image-et-son commented 2 years ago

> Oh, and of note, I got a stuck spinning rectangle on first try of reflashing by syx to stable 2.0, with the last "test RC14" still residing in teensy memory. On the second try I got an "E". On the third try I got nothing anymore. So for the fourth try I reflashed by hex and all went fine.

In the new user manual I included the hex procedure, because I think everybody should know that, not only engineers. Had to put in a huge disclaimer though because without removing the Teensy board you need to open the unit once with power on to press the button on the Teensy, so it's a critical thing to recommend to people...

el-folie commented 2 years ago

> > > if I understand the AVR timer logic correctly, in our case a WAIT(1) corresponds to 250ns (and then 4, 8 as multiples accordingly).

> > Okay, that'd mean that 350ns as the minimum like with my CD4051s is too slow anyway, right? In that case it's no wonder I'm getting glitches.

> Yes, but you also have the code surrounding it which also takes time. Hard to tell. It seems at the edge and the experiments around it support that. BTW: I looked up some discussions on the topic in the forum but it is hard to reconcile what people are reporting, see https://gearspace.com/board/showpost.php?p=10414068&postcount=677

I understand - in that case it´d really be the best idea to renew all logic/timing ICs to the fastest ones I can get. I will search for those and will do that, otherwise my testing and contribution to this project would only suffer, well unless we need a good negative example of P600s for the lower limits of speed optimizations.

el-folie commented 2 years ago

> > Oh, and of note, I got a stuck spinning rectangle on first try of reflashing by syx to stable 2.0, with the last "test RC14" still residing in teensy memory. On the second try I got an "E". On the third try I got nothing anymore. So for the fourth try I reflashed by hex and all went fine.

> In the new user manual I included the hex procedure, because I think everybody should know that, not only engineers. Had to put in a huge disclaimer though because without removing the Teensy board you need to open the unit once with power on to press the button on the Teensy, so it's a critical thing to recommend to people...

One thing also to be mentioned is that the USB plug usually won't fit in with the teensy sitting right on the old CPU socket, as the battery is in the way. I solved this by soldering the teensy onto a precision socket and by stacking two more precision sockets under the soldered socket to raise the USB socket, so that the USB plug then sits slightly above the battery. A bit of extra work, but worth it to be able to use hex with the teensy residing on the mainboard. (Or users need to remove the battery, which then no longer allows using the z80 and presets again.)

image-et-son commented 2 years ago

> One thing also to be mentioned is that the USB plug usually won't fit in with the teensy sitting right on the old CPU socket, as the battery is in the way. I solved this by soldering the teensy onto a precision socket and by stacking two more precision sockets under the soldered socket to raise the USB socket, so that the USB plug then sits slightly above the battery. A bit of extra work, but worth it to be able to use hex with the teensy residing on the mainboard. (Or users need to remove the battery, which then no longer allows using the z80 and presets again.)

I precisely documented that problem and those two options + finding a different spot for the battery.

el-folie commented 2 years ago

Yes, but you also have the code surrounding it which also takes time. Hard to tell. It seems at the edge and the experiments around it support that. BTW: I looked up some discussions on the topic in the forum but it is hard to reconcile what people are reporting, see https://gearspace.com/board/showpost.php?p=10414068&postcount=677

Just read it. So wait states and the "R on" of the relevant ICs together form the boundaries within which the system either works properly or does not. I see. And per the Gearspace comment, even tuning problems can be related to timing-relevant ICs and/or code functions, where normally one would just think "oh no, a hardware defect again...". Fascinating and important to know.

image-et-son commented 2 years ago

I guess those "tuning problems" are in fact effects of imprecise or spilled voltages which make pitches jittery (i.e. "the glitches"), not tuning in the narrower sense. The tuning functionality uses the voltage setting of the "normal path". But, as I said, I found it hard to understand what people were observing. The discussion was not very coherent...

el-folie commented 2 years ago

I guess those "tuning problems" are in fact effects of imprecise or spilled voltages which make pitches jittery (i.e. "the glitches"), not tuning in the narrower sense. The tuning functionality uses the voltage setting of the "normal path". But, as I said, I found it hard to understand what people were observing. The discussion was not very coherent...

Yes, descriptions are not always easy to understand when it´s about uncommon effects, not everyone concentrates on the same aspects of the same effects, so communication is always the key...

For the last few hours I tried to wrap my head around IC technology (CMOS, TTL, LS, HC(T), etc.) to understand what I could use in the P600 for better speed, or, better put, what not to use so that the Z80 and the old memory addressing/timing stay usable. I understand that old TTL logic control signals are higher in voltage and that modern logic ICs might not like them on their inputs.

In the long run I'd like to find a set of chips for all logic functions on the main board and for the multiplexers on the analog board that would allow the fastest possible control speeds. An IC table like that could also be important for future users of optimized OSs.

The question is also whether much faster response times could interfere with Z80 or Teensy++ instruction execution in a detrimental way. In other words, could logic ICs be too fast to work properly with deliberately programmed CPU wait states (and similar)? A propagation time difference of several hundred ns might suggest that, but it might depend on where the IC sits in the circuit and what function it performs.

I guess, like Matrix12x mentioned, the 74HC(T)4051 might be in the perfect spot: at the last point in the execution chain, distributing DAC signals to all synth voices, so exactly where speed matters the most. But I'm also thinking about what comes before all that and could influence the speed of the data flow, like the 4049 UBE buffers and the 4174 flip-flops/4013 switches in front of the DAC. Could all of those be exchanged for ultra-fast IC types? And maybe also all other logic ICs on the main board, starting right after the CPU socket (74LS138)? Or could that cause other timing trouble? A little voice in my head says "Yes, don't touch the main board, Z80/old memory/MIDI might freak out". But of course I can't know. ;-)

But generally this topic is extremely interesting, and it would be absolutely great if we found a chip set that gave every P600 the opportunity to become lightning fast in response without exhibiting any glitches.

matrix12x commented 2 years ago

@image-et-son Great news, I re-flashed using the .hex file and there is no zero load issue. I literally can't get it to happen. (good)

It properly stores and recalls pitch bend in manual (panel) mode when I switch to/from preset mode and when I turn on and off power. That issue I think is resolved.

matrix12x commented 2 years ago

@el-folie based on my digital design experience, I don't see how using faster chips in these specific spots (the multiplexers and the buffers) could possibly cause an issue.

The original CD chips are crap. I think they are "rated" for 1MHz max, and the 74 series can go to 25MHz. That means sharper edges on waveforms. The CD can only drive like 1mA, the 74 can drive like 4mA. 74 has multitudes better propagation timing. etc.

When looking at new chips to use, watch out for the max voltage the input pins (not just VCC) can take. Many types of chips can't go over 4.5V (or 3.3V) and we need to go up to 5V. I personally have used the 74HCT4051 and I know that works. so we have at least one datapoint.

Off the top of my head I didn't remember the difference between HC and HCT, but they cover it here: http://www.elecdude.com/2014/07/differences-in-cmos-4000-series-74ls-74hc-74hct.html

Looks like for our purposes HC and HCT should both work.

BTW the CD4049 (U401) has a max propagation delay of about 120 ns, whereas the 74HCT4049 has a propagation delay of about 8-10 ns.

It would be really interesting to add up the total propagation delays for the CD series chips relating to the CVs. Although I'm not sure if the total number matters as much as how long it takes the 4051 to switch from output to output.

I was just thinking about the CD4067 chips that are used to multiplex the control pots. I wonder if switching these to 74HCT4067 would help responsiveness.
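
The idea of adding up the propagation delays along the CV path could be sketched as below. This is only an illustration: the stage names are taken from chips mentioned in this thread, treating them as a simple series chain is an assumption, and only the two 4049 figures come from the discussion. All other values are left as `None` and would have to be filled in from the datasheets.

```python
# Worst-case propagation delays in ns. Only the 4049 figures come from this
# thread (120 ns for the CD part, upper end of 8-10 ns for the HCT part).
# None marks a stage whose value still has to be read from a datasheet.
# The series order of the stages is an assumption made for illustration.
CD_CHAIN = {"4174": None, "4013": None, "4049": 120, "4051": None}
HCT_CHAIN = {"4174": None, "4013": None, "4049": 10, "4051": None}


def summarize(chain):
    """Return (sum of known delays in ns, list of stages still missing)."""
    known = sum(d for d in chain.values() if d is not None)
    missing = [name for name, d in chain.items() if d is None]
    return known, missing


for label, chain in (("CD", CD_CHAIN), ("HCT", HCT_CHAIN)):
    total, missing = summarize(chain)
    print(f"{label}: {total} ns known, still missing: {', '.join(missing)}")
```

As noted above, the sum may matter less than the time the 4051 needs to switch between outputs, so a complete comparison would track that figure separately.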

image-et-son commented 2 years ago

Hi, to find out more about the interplay of software and hardware I have produced three debug versions. These have different wait time configurations in the "fast path": (1|1|0), (1|1|1), (1|2|2). @el-folie: if any of these configurations either eliminates or at least reduces the glitches on your P600, we can be sure we understand what is happening, and we also see where the limits are. You can see the respective configuration at the end of the file names.

The versions are really DEBUG versions: they send out a count of synth_update() calls per second via MIDI in the guise of a pitch bend on channel 1. (MIDIOX is very handy because it reads the full 8 bits of the MIDI data bytes even if MIDI would only allow 7 bits.) You know that wait times in the voltage setting eat into the synth_update() frequency, which basically determines the responsiveness of the user controls. On my machine I have the following basic frequency (i.e. without playing):

I previously found that you need something close to 200 in order to perceive everything as "smooth". I still have some ideas how to boost the frequency - after my last attempt (switching the "static voltages" to "fast path") produced glitches and had to be reversed. One thing I have discovered while looking at the update frequency is that it goes down dramatically during attack and decay/release phases, meaning that the ADSR lookup appears to be computationally expensive, which surprised me. In the sustain phase the update frequency is almost the same as not playing at all. I guess switching the linear shape to a lookup has had the side effect of reducing the responsiveness in that setting during long decay/release phases...
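
The per-second count of synth_update() calls described above can be illustrated with a small sketch. This is not the firmware code: `UpdateRateCounter` and `simulate` are hypothetical names, time is passed in as integer milliseconds to keep the example deterministic, and how the debug builds actually pack the count into the pitch bend message is not shown here.

```python
class UpdateRateCounter:
    """Counts calls per one-second window (hypothetical helper, not firmware code)."""

    WINDOW_MS = 1000

    def __init__(self, start_ms=0):
        self.window_start = start_ms
        self.count = 0

    def tick(self, now_ms):
        """Register one update at time now_ms.

        Returns the number of updates in the window that just ended once a
        full second has elapsed, otherwise None.
        """
        reported = None
        if now_ms - self.window_start >= self.WINDOW_MS:
            reported = self.count
            self.window_start = now_ms
            self.count = 0
        self.count += 1  # the current call belongs to the (new) window
        return reported


def simulate(period_ms, duration_ms):
    """Call tick() every period_ms and collect the reported per-second counts."""
    counter = UpdateRateCounter(0)
    reports = []
    for t in range(0, duration_ms + 1, period_ms):
        rate = counter.tick(t)
        if rate is not None:
            reports.append(rate)
    return reports


# One update every 5 ms corresponds to the roughly 200 per second
# mentioned above as the threshold for "smooth".
print(simulate(5, 2000))
```

Any wait time added inside the update loop lengthens the period between calls, which is why it shows up directly as a lower reported count.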

p600firmware_alpha14_debug-test_xxx.zip

el-folie commented 2 years ago

p600firmware_alpha14_debug-test_xxx.zip

Hi, doing these tests I feel like a kid again being back in the basement lab of a friend´s father in the early 80s, stacked with expensive machines and equipment we weren´t even allowed to touch (but did so anyway, haha), great memories.

Here are the results of my machine (in decimal values from MIDIOX):

(1|1|0)
100-108: power on, idling
40-42: unison seq max speed (rare glitches, but definitely when tweaking a knob)
20: unison seq max speed + 1 knob tweaking (glitches)

(1|1|1)
99-100: power on, idling
33: unison seq max speed (glitches, getting more intense on knob tweaking)
18: unison seq max speed + 1 knob tweaking (glitches)

(1|2|2)
81-85: power on, idling
20: unison seq max speed (glitches, getting more intense on knob tweaking)
13: unison seq max speed + 1 knob tweaking (glitches)

Switching the running seq from 6-v-unison to 1-v-unison leads to a higher refresh rate than switching to poly, possibly because of the computation of the 6 release times. Also, the refresh rate differed depending on whether the P600's MIDI IN was still receiving data looped back via MIDIOX, so for the exact refresh rate tests above I only connected P600 MIDI OUT to MIDIOX MIDI IN. With the MIDI loop connected, the refresh rates were a few values lower.
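
To compare the three configurations more easily, the readings reported above can be put side by side. The sketch below is only a convenience calculation: the numbers are copied from the results in this comment, the lower bound is used wherever a range was reported, and `retained` is a hypothetical helper name.

```python
# Readings reported above, per wait-time configuration:
# (idle low, idle high, unison seq at max speed, seq + one knob tweaked).
# Where a range was reported for the seq reading, the lower bound is used.
READINGS = {
    "(1|1|0)": (100, 108, 40, 20),
    "(1|1|1)": (99, 100, 33, 18),
    "(1|2|2)": (81, 85, 20, 13),
}


def retained(config):
    """Fraction of the lowest idle rate left under seq load and under seq + knob load."""
    idle_low, _idle_high, seq, seq_knob = READINGS[config]
    return seq / idle_low, seq_knob / idle_low


for config in READINGS:
    seq_frac, knob_frac = retained(config)
    print(f"{config}: seq {seq_frac:.0%} of idle, seq + knob {knob_frac:.0%} of idle")
```

In these readings every column drops as the wait times get longer, both in absolute terms and relative to the idle rate.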

image-et-son commented 2 years ago

So you wouldn't say that extending the wait times does any real good for the glitches?

el-folie commented 2 years ago

So you wouldn't say that extending the wait times does any real good for the glitches?

In my case it´s hard to say as per the above results. Also important to note that while testing the machine warmed up, so I should retest with a stone cold machine in a few hours again - maybe then the results are different.

Generally I think I should order those fast HC(T) multiplexers to see if it changes something. There must be a reason your machines are running smoothly and mine is not. Matrix12x said the old spec CD4051 ICs are crap, maybe that´s a good starting point...

el-folie commented 2 years ago

So you wouldn't say that extending the wait times does any real good for the glitches?

BTW what´s the denomination of your 4051 ICs?

el-folie commented 2 years ago

@el-folie based on my digital design experience, I don't see how using faster chips in these specific spots (the multiplexers and the buffers) could possibly cause an issue.

Nice roundup and link, thank you!

image-et-son commented 2 years ago

So you wouldn't say that extending the wait times does any real good for the glitches?

BTW what´s the denomination of your 4051 ICs?

Mine read "CD4051BE / RCA227".