sinara-hw / sinara

Sayma AMC/RTM issue tracker

Sayma in uTCA: no power #475

Closed jbqubit closed 5 years ago

jbqubit commented 6 years ago

@sbourdeauducq and I both see a problem with Sayma in a uTCA rack. Take a board that works fine on the desktop, powered via an external supply, and put it into the uTCA rack: JTAG over USB stops working. @gkasprow do you see this?

gkasprow commented 6 years ago

@hartytp is the blue LED blinking on the front panel? Did you enable the power supply in the crate?

hartytp commented 6 years ago

I think you mean @jbqubit

jbqubit commented 6 years ago

@gkasprow asked

is the blue LED blinking on the front panel

The power for the AMC slot is enabled and the MCH power LED is blinking green. When I run Sayma_AMC and Sayma_RTM on the desktop in my lab using a stand-alone power supply, I observe a current of 3.3 A. In the uTCA crate the current is only 1.85 A (as reported by the MCH).

jbqubit commented 6 years ago

And the blue LED is not blinking.

gkasprow commented 6 years ago

@jbqubit can you check if LEDs on Sayma AMC, close to the DC/DC converter are ON? The same with RTM LEDs. It looks like the RTM is not enabled. Maybe it is not fully plugged?

gkasprow commented 6 years ago

Anyway, an uninitialised Sayma should consume roughly 1.9 to 2 A.

jbqubit commented 6 years ago

gkasprow commented 6 years ago

So if the power LEDs are not on, what is consuming that much power? I will check it with my crate.

jbqubit commented 6 years ago

Were you able to reproduce @gkasprow?

jbqubit commented 6 years ago

@gkasprow Were you able to reproduce this?

jbqubit commented 6 years ago

Changed name to "Sayma in uTCA: no power" as this better reflects the underlying problem.

jbqubit commented 6 years ago

I've shorted pins 13 and 11 of JTAG header as in https://github.com/m-labs/sinara/issues/463. But no improvement.

sbourdeauducq commented 6 years ago

@gkasprow Have you reproduced/looked into this issue? Ken Brown is asking when the boards will work in µTCA.

gkasprow commented 6 years ago

We finally managed to collect all the required hardware in the lab. The NAT MCH and the supply had been lent to another group. Jakub is working on it right now, trying to get OpenMMC running on Sayma. So expect progress very soon.

sbourdeauducq commented 6 years ago

Porting OpenMMC in a short time sounds a bit ambitious. Maybe there is a quick hack that can be done to get the power supply running?

gkasprow commented 6 years ago

We are porting only the necessary things: power supply and sensor readout. All the remaining stuff is not really useful for ARTIQ. The processor for which OpenMMC was originally developed differs mainly in its package.

gkasprow commented 6 years ago

Jakub has ported the MMC firmware to the new CPU; tomorrow we will try to get it running.

gkasprow commented 6 years ago

MMC works, we still need to resolve some minor issues, but the MMC gets power. Hot swap also works. Tested with NAT and Vadatech MCH.

jbqubit commented 6 years ago

Does Sayma AMC + RTM also power up? Can you see startup ARTIQ/MiSoC UART messages on AMC?

gkasprow commented 6 years ago

Didn't check yet with RTM. The AMC wakes up, all supplies are on, start-up messages are on the USB UART.

gkasprow commented 6 years ago

RTM wakes up as well.

jbqubit commented 6 years ago

Congrats on getting the OpenOCD up and running on Sayma AMC! That's a big step toward the OpenMMC milestone.

As of this Issue now there's two versions of MMC software. One contains many months of configuration tweaks that got Sayma working (Forth). The second version now only contains these uTCA-power configuration tweaks (OpenMMC). Can you back port the uTCA-power tweaks to the Forth code so we can commence with testing with rack? I agree with @sbourdeauducq that it's better to reduce dependency on OpenOCD port for purposes of Sayma v1 milestone.

jbqubit commented 6 years ago

Once added to Forth MMC firmware and confirmed to work by @sbourdeauducq let's close this Issue.

sbourdeauducq commented 6 years ago

OpenOCD? Forth? This has close to nothing to do with the MMC.

gkasprow commented 6 years ago

Forth was used to play with the FPGA peripherals. Jakub has already integrated my init procedures with OpenMMC; now he is testing them.

jbqubit commented 6 years ago

OpenOCD? Forth? Typo. I mean OpenOCD.

OK. Please use OpenMMC milestone and related Issues to track progress of Jakub's work.

dhslichter commented 6 years ago

OpenOCD? Forth? Typo. I mean OpenOCD.

You mean OpenMMC? :)

jbqubit commented 6 years ago

OK. To test I need to know how to program OpenMMC firmware. cf https://github.com/m-labs/sinara/issues/428

gkasprow commented 6 years ago

@jbqubit are you able to get the FlashMagic software running on your local machine?

jbqubit commented 6 years ago

Yes, I have FlashMagic on a Windows machine.

gkasprow commented 6 years ago

OK.

1. Short pin 2 of T23 with pin 2 of T24. This will enable reset over USB.
2. Connect USB to the computer.
3. In Device Manager, under Ports (COM & LPT), check which port is FTDI port D.
4. Run the FlashMagic tool and select the correct COM port.

I will place the *.hex file here so you can test if it works.

gkasprow commented 6 years ago

Set the device to LPC1776 and the oscillator to 8 MHz, select the hex file and press Start.

jbqubit commented 6 years ago

Which .hex file?

gkasprow commented 6 years ago

I'm trying to generate it. I have it configured on another computer. I need to get hex file from axf file.

gkasprow commented 6 years ago

OK, this should work. To convert the axf file to a FlashMagic-compatible format I used these commands. I am posting them for future reference.

arm-none-eabi-objcopy -O binary "${BuildArtifactFileName}" "${BuildArtifactFileBaseName}.bin"
checksum -v "${BuildArtifactFileBaseName}.bin"
arm-none-eabi-objcopy -I binary "${BuildArtifactFileBaseName}.bin" -O ihex "${BuildArtifactFileBaseName}.hex"

Please use this file lpc1776_ethernet_I2C.zip

sbourdeauducq commented 6 years ago

You need only one command. https://github.com/m-labs/sinara/issues/448#issuecomment-359141733
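The linked comment is not reproduced in this thread, so what follows is only a guess at what "one command" means: `objcopy` can read the ELF-format .axf directly and write Intel HEX with `-O ihex`, without the intermediate .bin. The sketch below builds that command line in Python. The file name `firmware.axf` and the helper names are hypothetical. Note that this skips the `checksum` step from the three-command version; whether that step is still needed depends on the flashing tool, which this sketch does not verify.

```python
import shutil
import subprocess
from pathlib import Path

OBJCOPY = "arm-none-eabi-objcopy"  # same tool as in the commands quoted above


def ihex_command(axf_path):
    """Build the argv for a direct ELF (.axf) to Intel HEX conversion."""
    axf = Path(axf_path)
    hex_path = axf.with_suffix(".hex")  # firmware.axf -> firmware.hex
    return [OBJCOPY, "-O", "ihex", str(axf), str(hex_path)]


def convert(axf_path):
    """Run the conversion if the toolchain is installed; return the argv either way."""
    cmd = ihex_command(axf_path)
    if shutil.which(OBJCOPY) is None:
        print("objcopy not found; would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


# "firmware.axf" is a placeholder name, not a file from this issue.
print(" ".join(ihex_command("firmware.axf")))
```

Calling `convert("firmware.axf")` on a machine with the ARM toolchain on the PATH would perform the actual conversion.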

jbqubit commented 6 years ago

I received a replacement AMC from @gkasprow this week. The board powers on fine on the bench top when connected to the RTM. The blue front panel LED is not blinking. @gkasprow says this is the proper behavior of the board outside a uTCA crate, since the MMC firmware on this AMC is now more standards compliant. OK, good.

@gkasprow @wizath But with this AMC + RTM in the uTCA rack, I see fast flashing of the blue, red and green lights on the front panel. No power LEDs are illuminated on the RTM or the AMC.

jbqubit commented 6 years ago

@gkasprow @wizath On your board can you ping Sayma using MCH as switch?

gkasprow commented 6 years ago

@jbqubit we are having trouble getting ARTIQ running on the only AMC board we have. On Tuesday I will receive your board, so we will have a better chance of debugging it. To diagnose Ethernet, the things to check are:

If the power doesn't work in the crate, just check what the MMC and the MCH say.

sbourdeauducq commented 6 years ago

I flashed your openMMC_exar.axf and updated the Exar chips. The Sayma appears to work fine outside the µTCA crate. But now the whole µTCA crate dies (no more MCH response to TCP/IP) when a Sayma is inserted and the hotswap switch closed, with the following final MCH messages:

ScanPM: ...
no PM for fru_id=51
AMC2(6): Handle=0x01 - closed
no PM for fru_id=52
LSHM(0): FRU 6 sensor 38 LUN 0 'HS 006 AMC2' hotswap M1->M2
LSHM(0): FRU 6 sensor 38 LUN 0 'HS 006 AMC2' hotswap M1->M2
Activation: modules are ready
Activation: all modules ready, Allowance Period (20 sec) stopped - continue with module startup !
Activation: starting AMC2
bp_fru: power chan=6, current limit=8.0 A
PwrAllocate(fru 6 chan 6): allocating 8.0 A granted (available 44.5 A)
pm_EnablePayloadPwr(50,6):
PM(50,6):Enable PP (primSite:1 secSite:0)
pm_PP_good(6): Failure: power channel state=0x0b
pm_PP_good(6): Failure: power channel state=0x0b

sbourdeauducq commented 6 years ago

And of course another board produces different behavior:

ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): REQ(I2C=0x74) failed on bus 2 - no ACK
R(6,2,2)pm_processEvent PM(50): channel(1) new state=0x5b
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8

...and no MCH dying.
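When comparing boards, it can help to reduce a log like this to counts per failure type. The following is a minimal Python sketch, assuming the MCH console lines keep exactly the format quoted above; the function name and the regular expression are illustrative and not part of any NAT tooling.

```python
import re
from collections import Counter

# Pattern modelled on the ipmiMsgSender lines quoted in this thread.
IPMI_FAIL_RE = re.compile(
    r"ipmiMsgSender\((\d+)\): (REQ|RSP)\(I2C=(0x[0-9a-fA-F]+)\) "
    r"failed on bus (\d+) (?:result (-?\d+)|- (.+))"
)


def count_ipmi_failures(log_text):
    """Count ipmiMsgSender failures by (direction, I2C address, bus, detail)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = IPMI_FAIL_RE.search(line)
        if not m:
            continue  # unrelated console output
        _sender_id, kind, addr, bus, result, reason = m.groups()
        detail = f"result {result}" if result is not None else reason.strip()
        counts[(kind, addr, int(bus), detail)] += 1
    return counts


SAMPLE = """\
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
ipmiMsgSender(6): REQ(I2C=0x74) failed on bus 2 - no ACK
R(6,2,2)pm_processEvent PM(50): channel(1) new state=0x5b
ipmiMsgSender(6): RSP(I2C=0x74) failed on bus 2 result -8
"""

for key, n in count_ipmi_failures(SAMPLE).most_common():
    print(n, key)
```

The sketch only tallies what the MCH printed; it does not attempt to interpret the result code or the state value.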

sbourdeauducq commented 6 years ago

Sometimes I get yet another error:

Activation: modules are ready
Activation: all modules ready, Allowance Period (20 sec) stopped - continue with module startup !
Activation: starting AMC3
bp_fru: power chan=7, current limit=8.0 A
PwrAllocate(fru 7 chan 7): allocating 8.0 A granted (available 45.5 A)
pm_EnablePayloadPwr(50,7):
.CU1(40): FRU active (state M4)
PM(50,7):Enable PP (primSite:1 secSite:0)
pm_PP_good(7): Failure: power channel state=0x0b
pm_PP_good(7): Failure: power channel state=0x0b
CU1(40)  fan speed properties:
  minimum speed level:    0x00
  maximum speed level:    0x0f
  normal operating level: 0x03
  fan tray properties:    0x80
PM(50,7): new state=0x1b
pm_processEvent PM(50): channel(3) new state=0x1b
pm_processEvent PM(50): channel(7) new state=0x1b
...LSHM(0): FRU 7 sensor 38 LUN 0 'HS 007 AMC3' hotswap M2->M3
LSHM(0): CU1 FRU 40 added
LSHM(0): FRU 7 sensor 38 LUN 0 'HS 007 AMC3' hotswap M2->M3
LSHM(0): FRU 40 sensor 55 LUN 0 'Hot Swap' hotswap M3->M4
LSHM(0): FRU 0 sensor 32 LUN 0 'HS 000 CM' hotswap M3->M4
........
pm_PwrFree(3)
bp_fru: power chan=1, current limit=8.0 A
PwrAllocate(fru 3 chan 1): allocating 1.0 A granted (available 37.5 A)
pm_EnablePayloadPwr(50,1):
PM(50): payload already ON for FRU 3
PM(50,1):Enable PP (primSite:1 secSite:0)
WARN - LSHM(0): ignore version change sensor
LSHM(0): CM sensor 59 LUN 0 <unknown> hotswap M1->M2
LSHM(0): FRU 3 sensor 66 LUN 0 'HotSwap' hotswap M2->M3
LSHM(0): FRU 3 sensor 66 LUN 0 'HotSwap' hotswap M3->M4
pm_processEvent PM(50): channel(1) new state=0x5b
LSHM(0): FRU 7 sensor 38 LUN 0 'HS 007 AMC3' hotswap M3->M4
LSHM(0): FRU 7 sensor 38 LUN 0 'HS 007 AMC3' hotswap M3->M4
ScanPM: ...
no PM for fru_id=51
no PM for fru_id=52
no PM for fru_id=53
LSHM(0): FRU 40 sensor 52 LUN 0 '+12V' voltage 'lower non-critical go low' -assertion 
LSHM(0): FRU 40 sensor 52 LUN 0 '+12V' voltage 'lower critical go low' -assertion 
LSHM(0): FRU 40 sensor 52 LUN 0 '+12V' voltage 'lower non-recoverable go low' -assertion 
LSHM(0): FRU 40 sensor 51 LUN 0 '+12V_1' voltage 'lower non-critical go low' -assertion 
LSHM(0): FRU 40 sensor 51 LUN 0 '+12V_1' voltage 'lower critical go low' -assertion 
LSHM(0): FRU 40 sensor 51 LUN 0 '+12V_1' voltage 'lower non-recoverable go low' -assertion 

But, whatever happens, Sayma never gets powered.
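Because the behavior differs from run to run, a small script that pulls the hotswap transitions and the payload-power failures out of each capture makes the runs easier to compare. This is a rough Python sketch that assumes the line formats shown in the logs above; the names are hypothetical, and the number in `pm_PP_good(...)` is taken to be the power channel only because it matches the `chan=` value in the same logs.

```python
import re
from collections import defaultdict

# Patterns modelled on the MCH console lines quoted in this thread.
HOTSWAP_RE = re.compile(
    r"LSHM\(\d+\): FRU (\d+) sensor \d+ LUN \d+ '([^']*)' hotswap (M\d)->(M\d)"
)
PP_FAIL_RE = re.compile(
    r"pm_PP_good\((\d+)\): Failure: power channel state=(0x[0-9a-fA-F]+)"
)


def summarize(log_text):
    """Return (hotswap transitions per FRU, power-good failure states per channel)."""
    transitions = defaultdict(list)
    failures = defaultdict(list)
    for line in log_text.splitlines():
        m = HOTSWAP_RE.search(line)
        if m:
            fru, _name, old, new = m.groups()
            step = (old, new)
            # The MCH often prints a transition twice in a row; keep one copy.
            if not transitions[int(fru)] or transitions[int(fru)][-1] != step:
                transitions[int(fru)].append(step)
            continue
        m = PP_FAIL_RE.search(line)
        if m:
            failures[int(m.group(1))].append(int(m.group(2), 16))
    return dict(transitions), dict(failures)


SAMPLE_LOG = """\
LSHM(0): FRU 6 sensor 38 LUN 0 'HS 006 AMC2' hotswap M1->M2
LSHM(0): FRU 6 sensor 38 LUN 0 'HS 006 AMC2' hotswap M1->M2
Activation: starting AMC2
bp_fru: power chan=6, current limit=8.0 A
pm_PP_good(6): Failure: power channel state=0x0b
pm_PP_good(6): Failure: power channel state=0x0b
"""

hotswap, pp_failures = summarize(SAMPLE_LOG)
print("hotswap:", hotswap)
print("power-good failures:", pp_failures)
```

A FRU whose transition list stops before M4 while its channel shows up under the power-good failures would match the "never gets powered" symptom described here.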

sbourdeauducq commented 6 years ago

This latter error turns off the fans completely (and then parts of the super-reliable µTCA crate overheat), but the MCH still responds to TCP/IP.

vdirksen commented 6 years ago

@sbourdeauducq : Please start the web interface of the NAT-MCH. If you have not changed the IP address, then you can use "firefox 192.168.1.41". Login if not changed is "root" without quotes and password "nat" without quotes. Scroll down to "System information". Click on this and then let it create a text file. Please send that system file to support@nateurope.com. We can see then the history and also all components in the system. We will try to support you here.

hartytp commented 6 years ago

@gkasprow I gather this is fixed now, but please reopen if it's not.

sbourdeauducq commented 6 years ago

@hartytp No commits to https://github.com/m-labs/mmc-firmware so obviously this isn't fixed.

hartytp commented 6 years ago

Hmm Greg reported having BP Ethernet working on Sayma in a uTCA rack. So I think the issue is fixed in his code. Not having the code uploaded to the repo sounds like a separate issue to me.

jbqubit commented 6 years ago

Not having the code uploaded to the repo sounds like a separate issue to me.

Given the amount of frustration over this topic and the several failed past attempts at remedy I'd like to keep this open until it's confirmed by an end user.

gkasprow commented 6 years ago

@wizath where is the code?

sbourdeauducq commented 6 years ago

@gkasprow @wizath Did you get Sayma running without power supply or fan-related bugs in the µTCA crate? Did you test it very carefully, as the behavior I observed was non-deterministic? If so, can we get updated OpenMMC firmware images?