sinara-hw / sinara

Sayma AMC/RTM issue tracker

µTCA: no power to Sayma when RTM is plugged #571

Closed sbourdeauducq closed 5 years ago

sbourdeauducq commented 6 years ago

Blue and red LEDs on Sayma AMC blink rapidly, and all voltage LEDs on Sayma AMC are off.

gkasprow commented 6 years ago

Does it start with AMC only?

sbourdeauducq commented 6 years ago

Yes.

gkasprow commented 6 years ago

Did you try with another RTM?

gkasprow commented 6 years ago

And is it the AMC I shipped you recently? Rapid blinking looks like the overcurrent issue that was fixed by the slow-start circuits added to recent hardware.

gkasprow commented 6 years ago

And please test it in leftmost and rightmost slots.

sbourdeauducq commented 6 years ago

And is it the AMC I shipped you recently?

Yes. It has the slow-start circuit rework. Previously, AMCs never started at all in the crate. The RTM powers up when using the ATX supply.

gkasprow commented 6 years ago

@wizath do you remember if any particular MCH settings were necessary to get power working?

gkasprow commented 6 years ago

@sbourdeauducq I'm pretty sure you have to set up a current limit on your MCH.

jbqubit commented 6 years ago

@gkasprow @wizath This is super annoying! @sbourdeauducq is now experiencing the same type of power-on problems that caused me to send boards back to WUT in the middle of June. If there are configuration steps needed for the MCH, please document them on sinara-hw/sayma.

wizath commented 6 years ago

I don't think there is any configuration to do - since we used @jbqubit's MCH and I didn't change anything.

jbqubit commented 6 years ago

I've made modifications to the default behavior of MCH. I know I've discussed this with @gkasprow. The power-on behavior is set in a configuration file uploaded to the MCH. Here's the default that ships with NAT Native-R6. Here's a version that I've used in the past but not validated on functioning boards. Given what we know now about inrush current, using a one second delay between power-on of each AMC slot seems wise.

gkasprow commented 6 years ago

@sbourdeauducq could you post logs from MMC and MCH?

marmeladapk commented 6 years ago

It's likely that we fixed this issue, at least on one Sayma, the one marked with "F" on an orange sticker.

gkasprow commented 6 years ago

@marmeladapk AFAIR it was an issue related to no power in the crate, not really related to the RTM.

marmeladapk commented 6 years ago

Sometimes only AMC gets power, RTM doesn't start.

sbourdeauducq commented 6 years ago

@gkasprow / @marmeladapk Did you find a solution?

gkasprow commented 6 years ago

Jakub will send you an update with RTM power negotiation disabled. In a few days we will get a brand new MCH, so we will be able to recreate the issue.

sbourdeauducq commented 6 years ago

Any progress on this?

wizath commented 6 years ago

fw_forced_rtm.tar.gz

Power level switch is set to 0x01 after enabling power for AMC, so RTM boots straight after.

sbourdeauducq commented 6 years ago

Nope, that doesn't work. Same situation as before: there is power when the AMC alone is plugged in, but everything shuts down when the RTM is also plugged in. Have you tested it?

wizath commented 6 years ago

Tested on NAT PSU+MCH, RTM gets power just after AMC. But this is Vadatech crate and fixed boards with current limiters.

sbourdeauducq commented 6 years ago

Does the RTM need a current limiter too? It doesn't work either with the RTM I received from Technosystem today, on which all known reworks are supposed to be applied.

gkasprow commented 6 years ago

A brand new PSU is on the way. We will recreate the problem with it.

gkasprow commented 6 years ago

We've just received a brand new NAT crate, 2 MCHs and 4 PSUs, so we have all we need to recreate the problem.

gkasprow commented 6 years ago

The boards you received work perfectly with the NAT PSU and MCH. Please post here what the MCH and PSU say in their logs. It looks like you have to adjust the current limit in the MCH.

sbourdeauducq commented 6 years ago

I don't know how you can expect anything other than IPMI snafu...

ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ScanPM: ...
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
no PM for fru_id=51
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
no PM for fru_id=52
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
no PM for fru_id=53
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ScanPM: ...
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
no PM for fru_id=51
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
no PM for fru_id=52
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
no PM for fru_id=53
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
AMC1(5): Communication regained !
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): REQ(I2C=0x72) failed on bus 1 - no ACK
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
sbourdeauducq commented 6 years ago

Sometimes there is this too:

ipmi_SendFru(5): timeout - no response for REQ: 0x20->0x72, Seq=3 GET_DEVICE_ID_REQ
PM(50,5): FRU 5 - de-assert #ENABLE
PM(50,5): Deassert #ENABLE
PM(50,5): FRU 5 - assert #ENABLE
PM(50,5): Assert #ENABLE
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
sbourdeauducq commented 6 years ago

If I unplug the RTM then the mess stops (and the AMC gets powered):

ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ScanPM: ...
no PM for fru_id=51
no PM for fru_id=52
no PM for fru_id=53
AMC1(5): Communication regained !
AMC1(5): AMC needs 8.0 Amps power
....
AMC1(5): Handle=0x01 - closed
bp_fru: power chan=5, current limit=8.0 A
PwrAllocate(fru 5 chan 5): allocating 8.0 A granted (available 44.5 A)
pm_EnablePayloadPwr(50,5):
LSHM(0): CM sensor 68 LUN 0 <unknown> hotswap M0->M1
LSHM(0): CM sensor 68 LUN 0 <unknown> hotswap M1->M2
LSHM(0): FRU 5 sensor 67 LUN 0 'HS 005 AMC1' hotswap M2->M3
LSHM(0): FRU 5 sensor 67 LUN 0 'HS 005 AMC1' hotswap M1->M2
LSHM(0): FRU 5 sensor 67 LUN 0 'HS 005 AMC1' hotswap M2->M3
PM(50,5):Enable PP (primSite:1 secSite:0)
pm_PP_good(5): Failure: power channel state=0x0b
pm_PP_good(5): Failure: power channel state=0x0b
PM(50,5): new state=0x1b
pm_processEvent PM(50): channel(5) new state=0x1b
LSHM(0): FRU 5 sensor 67 LUN 0 'HS 005 AMC1' hotswap M3->M4
LSHM(0): FRU 5 sensor 67 LUN 0 'HS 005 AMC1' hotswap M3->M4
ScanPM: ...
no PM for fru_id=51
no PM for fru_id=52
no PM for fru_id=53
ScanPM: ...
no PM for fru_id=51
no PM for fru_id=52
no PM for fru_id=53

and restarts as soon as I hotplug it:

ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8
sbourdeauducq commented 6 years ago

The behavior is the same with the boards you sent, and those I had (after reflashing).
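
The MCH logs above are dominated by one repeated line, which makes the few informative entries (`ScanPM`, `no PM for fru_id=...`, `Communication regained`) hard to spot. A minimal sketch, assuming the console output has been captured as a list of text lines, that collapses consecutive duplicates into a line plus a repeat count. The function names `collapse_repeats` and `format_collapsed` are hypothetical helpers, not part of any NAT tool.

```python
from itertools import groupby


def collapse_repeats(lines):
    """Collapse runs of consecutive identical lines into (line, count) pairs."""
    return [(line, sum(1 for _ in run)) for line, run in groupby(lines)]


def format_collapsed(pairs):
    """Render (line, count) pairs, annotating lines that repeated."""
    return [line if count == 1 else f"{line}  [x{count}]" for line, count in pairs]


if __name__ == "__main__":
    # Short sample in the style of the log lines quoted in this thread.
    sample = [
        "ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8",
        "ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8",
        "ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8",
        "ScanPM: ...",
        "no PM for fru_id=51",
        "ipmiMsgSender(5): RSP(I2C=0x72) failed on bus 1 result -8",
    ]
    for row in format_collapsed(collapse_repeats(sample)):
        print(row)
```

Only consecutive duplicates are merged, so the order of events in the log is preserved.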

gkasprow commented 6 years ago

I'm back from holidays. I plugged the AMC board into a brand new uTCA crate and the AMC works, but when I plug in the RTM, all AMC LEDs blink quickly. So it looks like we have a similar issue here.

sbourdeauducq commented 6 years ago

Any progress fixing this?

gkasprow commented 6 years ago

@vdirksen I found that when I plug the power module into the rightmost slot (PM4) in the chassis, it gets power. When I plug it into any of the left slots (PM2), it does not get power when the RTM is installed. The LEDs on the AMC blink quickly.

I tried with 2 MCH (basic and PHYS) and 3 PM AC600D modules. In PM1 and PM3 slot the MCH wakes up and then does nothing saying:

pm_PwrFree(fru 3): no power module assigned to power channel 1
PwrAllocate(fru 3): no power module assigned to power channel 1
mcmc(3): Power allocation failed, cannot advance to M4

So @sbourdeauducq please insert the PM into slot 4 and try. In the meantime I will investigate what is going on.

sbourdeauducq commented 6 years ago

There is only one power supply slot in the chassis I have.

gkasprow commented 6 years ago

Funny thing, I got the same symptoms when I plugged in the RTM loopback... There is almost nothing on it.

gkasprow commented 6 years ago

OK, known issue - MCH sometimes does not wakeup when USB plug is inserted

gkasprow commented 6 years ago

And PM needs to be inserted to right slot. PM slot assignment can be programmed by NATView application.

gkasprow commented 6 years ago

Strange, I cannot recreate the problem any more. The only conclusion I have come to is that the mTCA crate disconnects the management power for a moment, which resets the MMC CPU. This repeats a few times per second. It looks like the power consumption from 3.3V was too high.

hartytp commented 6 years ago

OK, known issue - MCH sometimes does not wakeup when USB plug is inserted

What's the deal here? Those NAT MCHs were expensive, so why are they shipping with known bugs like this? Are they actually a high-quality component we can rely on, or just another shoddy over-priced piece of kit targeting physicists?

More generally, it strikes me as crazy that getting Sayma to power up in a rack is proving to be this much work. AFAICT, at least some of the blame lies with the NAT components we're using.

Since we've agreed to drop support for the LLRFBP in the next design revision, we can use a wide variety of uTCA AMC+RTM chassis from other vendors. Would it be quicker to find a higher-quality chassis + MCH from another vendor?

gkasprow commented 6 years ago

We also use MTCA equipment from Vadatech. It does not conform to the MTCA specification, and that is probably why some of the issues appeared with NAT. But now I have 2 NAT sets in the lab and can finally debug and solve the issues.

hartytp commented 6 years ago

@gkasprow okay, well that's good. It will make testing much easier if we all have a consistent setup, so we don't have to ask questions like "which PSU are you using" when we discover issues.

I would still like to know why NAT are shipping MCHs with known issues like "it doesn't work when you power it up with the USB plugged in". That doesn't give me much confidence in the quality of their hardware or testing.

gkasprow commented 6 years ago

You can always enable power to all modules, ignoring all the MMC issues, but I want to do it properly.

gkasprow commented 6 years ago

Yesterday a colleague from CERN visited my lab and gave me some advice on how to handle the NAT equipment. The equipment is good quality, but it can be configured in so many ways that it causes confusion. The USB plug is probably intended for the bootloader only, for advanced users, and the main diagnostic and configuration tool is the NatView software, written in Java so that it runs on all platforms. I had no issues running it on Windows. Diagnostics over USB are very limited. The same goes for the PSUs: their configuration in slots (main/redundant) is configurable. That's why not all combinations work, and that's why I was surprised by that fact.

I have experience with Vadatech equipment, which is much simpler, and you can configure almost nothing. And the documentation is poor. And its quality is questionable. The fan trays broke several times, and the same with the MCH. I had to reverse engineer the MCH to disable spread spectrum in the PCIe hub :)

NAT is very flexible and needs good understanding to use it properly. Once I manage to set it up and prove it works in my lab, I will create a guide on how to build and configure a working setup.

sbourdeauducq commented 6 years ago

And PM needs to be inserted to right slot.

Again, the crate I'm using has only one slot, in the left.

written in Java [...] NAT is very flexible and needs good understanding to use it properly.

Let's call it poor design. We just want to power up a few boards and spin up a couple fans, for God's sake.

You can always enable power to all modules, ignoring all the MMC issues,

You keep saying this, but I have not seen it demonstrated. When I attempted it, the whole crate shut down, but that was before adding the current limit to Sayma. Is it working now?

More generally, it strikes me as crazy that getting Sayma to power up in a rack is proving to be this much work

This seems to be standard fare with uTCA according to a few engineers I've talked to, and one main reason why I was opposed to using it.

gkasprow commented 6 years ago

@sbourdeauducq It worked for me with permanently enabled power supplies. The current limit that I added helped; before, the inrush current was too high. So it should work in your crate too. Remember that you cannot hotplug. Now I want to fully conform to NAT requirements because they follow the MTCA specification strictly. We have MTCA systems installed in numerous places, but so far almost all of them use Vadatech crates. That's why there were no issues with the supply. We switched to NAT for several reasons, and now it's my responsibility to make it work.

gkasprow commented 6 years ago

With 2 power modules, the AMC+RTM works properly. It looks like the AMC is drawing too much current from 3.3V MP.

gkasprow commented 6 years ago

The MTCA spec says that it should not exceed 150mA. Sayma is consuming far less. So maybe inrush current is the issue here. I observe on the scope that management power arrives, but a few ms later it starts falling, which causes an MMC reset, and the process repeats. Maybe the 3.3VMP rail switch consumes too much during power-on...

gkasprow commented 6 years ago

The spec says: The Power Modules shall support MP inrush current up to 270 mA for 200 ms to each AMC slot
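
The two limits quoted in this thread (150 mA steady draw, and 270 mA inrush for up to 200 ms) can be turned into a simple check against a scope capture. This is only a sketch: the constants are the figures as quoted above, not taken from the specification text itself, and `mp_within_budget` and the `(time_ms, current_ma)` sample format are assumptions made for illustration.

```python
# Figures as quoted in this thread; verify against the MTCA specification.
MP_STEADY_LIMIT_MA = 150.0
MP_INRUSH_LIMIT_MA = 270.0
MP_INRUSH_WINDOW_MS = 200.0


def mp_within_budget(samples):
    """Check management-power current samples against the quoted limits.

    samples: iterable of (time_ms, current_ma), with time measured from
    the moment management power is applied to the slot.
    """
    for time_ms, current_ma in samples:
        if time_ms <= MP_INRUSH_WINDOW_MS:
            limit = MP_INRUSH_LIMIT_MA
        else:
            limit = MP_STEADY_LIMIT_MA
        if current_ma > limit:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical capture: high inrush that settles below the steady limit.
    print(mp_within_budget([(0, 260.0), (100, 200.0), (300, 120.0)]))
    # Hypothetical capture: steady draw still above the limit after 200 ms.
    print(mp_within_budget([(0, 100.0), (300, 180.0)]))
```

A draw that stays above 150 mA after the inrush window fails the check even if it never approaches 270 mA.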

gkasprow commented 6 years ago

OK, I know what is going on. OpenMMC initialises all IOs immediately after boot. At that point we are in the first power stage, where only management power is present. Since the IOs are initialised, some of them are at level H and cause current to flow via the protection diodes of several circuits, which raises the P3V3 power rail to 1.5V. Of course this draws much more from the 3V3 rail than MTCA allows. The spec says 150mA, while we get 180mA. I expect max 50mA. The fix is easy: change the initialisation sequence in the MMC so it initialises GPIOs after the MMC negotiates power.
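
The ordering change described above can be illustrated with a toy model. This is not OpenMMC code (OpenMMC is written in C). `MmcModel` and its method names are hypothetical and exist only to show the intended sequence: pins stay undriven while only management power is present, and GPIOs are initialised once payload power has been granted.

```python
class MmcModel:
    """Toy model of the MMC start-up ordering; names are hypothetical."""

    def __init__(self):
        self.payload_power = False
        self.gpios_driven = False
        self.log = []

    def on_management_power(self):
        # Only management power is present: do not drive payload-facing
        # pins, so no current flows through protection diodes.
        self.log.append("management power: GPIOs left undriven")

    def on_payload_power_granted(self):
        # Power negotiation with the MCH has completed.
        self.payload_power = True
        self.log.append("payload power granted")
        self.init_gpios()

    def init_gpios(self):
        if not self.payload_power:
            raise RuntimeError("GPIO init requested before payload power")
        self.gpios_driven = True
        self.log.append("GPIOs initialised")


if __name__ == "__main__":
    mmc = MmcModel()
    mmc.on_management_power()
    mmc.on_payload_power_granted()
    for entry in mmc.log:
        print(entry)
```

The guard in `init_gpios` makes the ordering explicit: calling it before power negotiation is treated as an error rather than silently driving the pins.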

hartytp commented 6 years ago

Good catch!

gkasprow commented 6 years ago

The RTM consumes some current because the I2C switch outputs have 2k2 pull-ups to 3V3MP, which adds a few mA to the bill. And these few mA are too much for the 3.3V management rail.
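
As a rough worked figure for the pull-up contribution: a 2.2 kΩ resistor to a 3.3 V rail carries about 1.5 mA when the line is held at 0 V. The sketch below computes that worst case per line. The number of pulled-up lines is not stated in this thread, so `n_lines` is a placeholder assumption.

```python
def pullup_current_ma(rail_v, resistor_ohm):
    """Worst-case current through one pull-up, with the line held at 0 V."""
    return rail_v / resistor_ohm * 1000.0


if __name__ == "__main__":
    per_line_ma = pullup_current_ma(3.3, 2200.0)  # about 1.5 mA
    n_lines = 4  # hypothetical count, not taken from the schematic
    print(f"per line: {per_line_ma:.2f} mA, {n_lines} lines: {n_lines * per_line_ma:.2f} mA")
```

Lines that sit high draw no pull-up current, so the real contribution depends on how many lines are held low or clamped by unpowered devices.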