KastnerRG / riffa

The RIFFA development repository
https://riffa.ucsd.edu

RIFFA Driver fails on 32-bit ARM core with __aeabi_uldivmod during boot #9

Open drichmond opened 8 years ago

drichmond commented 8 years ago

Correspondence from RIFFA Users Mailing list:

In riffa_driver.c, can you find the function declaration for __udivdi3 and change the function header as shown below:

Currently: unsigned long long __udivdi3(unsigned long long num, unsigned long long den)

Suggested change: unsigned long long __udivdi3(unsigned long long num, unsigned long den)

After this, recompile the driver and reinstall. Afterward, send the output of dmesg | grep riffa

I am going to file this as a bug on GitHub and attach our correspondence.
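For context on why the divisor width matters: on a 32-bit ARM target, a plain 64-bit `/` or `%` is normally compiled into a call to the runtime helper __aeabi_uldivmod, which is the symbol the module loader reports as unknown. The sketch below is not the code from riffa_driver.c; it is a minimal, hypothetical helper (`div64_by_32` is an invented name) that divides a 64-bit value by a 32-bit divisor with shift-and-subtract long division, so the source contains no 64-bit division operator. Inside the kernel the usual route would be an existing helper such as `div_u64()` from `<linux/math64.h>` rather than a hand-written loop.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the RIFFA driver.
 * Divides a 64-bit numerator by a 32-bit denominator using only shifts,
 * compares, and subtraction (no 64-bit '/' or '%' operator appears). */
static uint64_t div64_by_32(uint64_t num, uint32_t den, uint32_t *rem_out)
{
    uint64_t quot = 0;
    uint64_t rem = 0;   /* stays below den after every step */
    int bit;

    if (den == 0) {     /* assumption: report 0 instead of trapping */
        if (rem_out)
            *rem_out = 0;
        return 0;
    }

    /* Classic binary long division, most significant bit first. */
    for (bit = 63; bit >= 0; bit--) {
        rem = (rem << 1) | ((num >> bit) & 1u);  /* bring down next bit */
        if (rem >= den) {
            rem -= den;
            quot |= (uint64_t)1 << bit;
        }
    }

    if (rem_out)
        *rem_out = (uint32_t)rem;
    return quot;
}
```

A caller would pass the 64-bit quantity and a 32-bit divisor, for example `div64_by_32(total_bytes, chunk_size, &leftover)`, where those argument names are placeholders. The loop trades speed for having no dependency on the compiler's division runtime.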

drichmond commented 8 years ago

Hi Dustin, Thank you very much for the quick reply. Here is the output.

ubuntu@tegra-ubuntu:~$ dmesg | grep riffa
[ 10.575363] riffa: Unknown symbol __aeabi_uldivmod (err 0)
[ 10.577102] riffa: Unknown symbol __aeabi_uldivmod (err 0)

Regards, Navoda

drichmond commented 8 years ago

Hi Navoda

What is the output when you run:

dmesg | grep riffa

Dustin

drichmond commented 8 years ago

I am a university student. For one of our projects, we are building a system in which an Nvidia Tegra K1 works alongside an FPGA. We obtained the development board for the Tegra K1, which has a mini PCI connector, and we are using a mini PCI to 1x PCI converter to connect it to the Xilinx VC707 FPGA board. The Tegra K1 chip (http://www.nvidia.com/object/tegra-k1-processor.html) has a quad-core ARM Cortex-A15 processor, on which we are running Ubuntu 14.04.1 LTS. I made a simple loopback design using RIFFA for 1x lane width and tested it with the board plugged into a PC. It worked fine. Then I used the same design with our setup. I verified with the lspci command that the OS on the Tegra K1 recognizes the PCI device, and received the following information.

"ubuntu@tegra-ubuntu:/dev$ sudo lspci -vv -s 01:00
01:00.0 Memory controller: Xilinx Corporation Device 7021
    Subsystem: Xilinx Corporation Device 0007
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 130
    Region 0: Memory at 32200000 (32-bit, non-prefetchable) [size=1K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [60] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s, Exit Latency L0s unlimited, L1 unlimited
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range B, TimeoutDis-, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Device Serial Number 00-00-00-01-01-00-0a-35"

However, when I used the RIFFA sample_app to check for the device by running "./testutil 0", it did not recognize the FPGA and reported "Error populating fpga_info_list".

Then we thought the issue must be with the processor architecture, since we usually run RIFFA on Intel x64 or x86.

Does anyone have any idea what we can do? We are pretty clueless, actually.

Thank you in advance. Regards, Navoda

cospan commented 8 years ago

I doubt that this is the problem but I ran into an issue when interfacing my Spartan 6 board with the Nvidia TX1.

I have no problem communicating with my board using a 64-bit x86 Intel box, but when I tried it out with the Nvidia TX1 I got strange behavior.

I discussed this with an EE friend and he asked if I was using the clock generated from the TX1 or a clock generated locally on my board.

I was using a clock generated on my board, and he said there are some motherboards and SoCs that behave strangely when the card does not use the same clock as the PCIe root complex.

Apparently many PCIe cards take their clock from the root complex. It's cheaper not to have to generate a clock on their own, so some SoCs and motherboards are not tested that thoroughly with an asynchronous clock.

If possible can you try and generate an image that uses the clock from the TK1? Unfortunately I don't have that option on my board so I can't try it out.

I have designed a new board using the clock generated by the root complex and after I get the RIFFA core to a point where I can exercise this I'll be able to see if it was the issue.

Dave

drichmond commented 8 years ago

I think this issue was solved in another way, but I'm still trying to figure out why what was tried worked.

The default now is to use the clock generated by the root complex (I believe).