cms-gem-daq-project / ctp7_modules


Bus error persists after reading disconnected SCA-ADC channels #119

Closed ram1123 closed 5 years ago

ram1123 commented 5 years ago

Brief summary of issue

Once we access a disconnected channel, it gives a bus error. After that, it starts giving errors for the connected channels as well.

Types of issue

Expected Behavior

This shows the following error:

While running the command:

[rasharma@gem904daq01 bin]$ ./blaster eagle34 08 0x200
calling amc.readADCCommands on eagle34 with 8 and 512...done!
RPC ERROR:memsvc error: Bus error accessing 0x66c0400c

In the log panel it shows:

Apr 15 10:45:39 CTP7 local0.notice rpcsvc[5827]: rpcsvc: Client connected from 192.168.0.180
Apr 15 10:45:39 CTP7 local0.info rpcsvc[5827]: amc:  Memhub initialized a semaphore. Current semaphore value = 1
Apr 15 10:45:39 CTP7 local0.info rpcsvc[5827]: amc.readADCCommands: DEBUG: 3: Name of ADC Channel to read: 8
Apr 15 10:45:39 CTP7 local0.info rpcsvc[5827]: amc.readADCCommands: write memsvc error: Bus error accessing 0x66c0400c
Apr 15 10:45:39 CTP7 local0.info rpcsvc[5827]: amc.readADCCommands: DEBUG: 5: scaADCCommandOutput for OH0 = 0 

Current Behavior

Steps to Reproduce (for bugs)

  1. Check out the code at commit 834dd83675a3de1239ea6b431d761837fcda7007
  2. Run the readADCCommands module with the command below; it should produce the bus error shown above:
    ./blaster eagle34 08 0x200

Possible Solution (for bugs)

Context (for feature requests)

Your Environment

jsturdy commented 5 years ago

What is a "disconnected" ADC channel (and why are you trying to read them?)

The possible solution is to enable the current sink only for the temperature channels not for others.

  1. This is not just a "possible solution", it is the operational mode that must be followed.
  2. Additionally, the current source must only be enabled (if needed) for the channel being sampled, and then disabled when done.

ram1123 commented 5 years ago

@jsturdy

What is a "disconnected" ADC channel (and why are you trying to read them?)

By "disconnected", I mean the channels that are not used like sca_enums.h#L243-L245

jsturdy commented 5 years ago

@jsturdy

What is a "disconnected" ADC channel (and why are you trying to read them?)

By "disconnected", I mean the channels that are not used like sca_enums.h#L243-L245

OK, in this case, it might be good to comment out those enums that are not connected, so that it is not possible to enable the monitoring of them.

ram1123 commented 5 years ago

Hi @jsturdy ,

The good ADC channels are: 00, 04, 07, 08, 0E, 0F, 11, 12, 13, 15, 18, 1B, 1E and 1F. The disconnected/bad ADC channels are: 01, 02, 03, 05, 06, 09, 0A, 0B, 0C, 0D, 10, 14, 16, 17, 19, 1A, 1C and 1D.
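As a rough illustration, the two lists above could be turned into a guard that rejects a disconnected channel before any read is attempted. This is only a sketch in Python; the names `GOOD_ADC_CHANNELS`, `BAD_ADC_CHANNELS` and `check_adc_channel` are hypothetical and are not part of ctp7_modules.

```python
# Connected ("good") SCA ADC channels, copied from the list above.
GOOD_ADC_CHANNELS = frozenset({
    0x00, 0x04, 0x07, 0x08, 0x0E, 0x0F, 0x11,
    0x12, 0x13, 0x15, 0x18, 0x1B, 0x1E, 0x1F,
})

# Every other channel in 0x00..0x1F is treated as disconnected ("bad").
BAD_ADC_CHANNELS = frozenset(range(0x20)) - GOOD_ADC_CHANNELS


def check_adc_channel(channel):
    """Return the channel if it is connected, otherwise raise ValueError."""
    if channel not in GOOD_ADC_CHANNELS:
        raise ValueError("ADC channel 0x%02X is not connected" % channel)
    return channel
```

Calling `check_adc_channel(0x08)` returns the channel unchanged, while `check_adc_channel(0x01)` raises before anything is sent to the hardware.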

  • Can you reset the system and run testConnectivity.py and start fresh doing the following

    1. In a loop over "good" ADC channels (after looping over the "good" ADCs, select a "bad" ADC for the last iteration):

      1. read Imon in the DCS mainframe
      2. run test on ADC channel
      3. read Imon in the DCS mainframe

I performed this test, and the value of Imon remained the same (1.85 A) before and after the test.

  1. Then do the following for each ADC channel (first all "good" ADCs, and then one-by-one a "bad" ADC):

    1. reset the system and run testConnectivity.py
    2. read Imon in the DCS mainframe
    3. run test on ADC channel
    4. read Imon in the DCS mainframe

I also performed this test, and the value of Imon remained the same (1.85 A) before and after the test.

  • log everything in an elog

It's logged in cms-elog-1082877

The possible solution is to enable the current sink only for the temperature channels not for others.

  1. This is not just a "possible solution", it is the operational mode that must be followed.
  2. Additionally, the current source must only be enabled (if needed) for the channel being sampled, and then disabled when done.

ram1123 commented 5 years ago

This issue was understood, but I forgot to reply.

The issue described above was caused by the current sink. Whenever we read a temperature channel, we need to enable the current sink just before the read and disable it immediately afterwards. If we leave it enabled and then try to read any other register, we get an error saying

Bus error accessing XXXX

So every time we read a temperature channel, we need to enable the current sink before the read and disable it right after.
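The enable, read, disable sequence can be sketched as a Python context manager, so that the current sink is switched off even when the read fails. This is a minimal sketch, not the code in ctp7_modules: the `sca` object and its `enable_current_sink`, `disable_current_sink` and `read_adc` methods are hypothetical stand-ins for whatever actually talks to the SCA.

```python
from contextlib import contextmanager


@contextmanager
def current_sink_enabled(sca, channel):
    """Enable the current sink for one channel and always disable it afterwards."""
    sca.enable_current_sink(channel)  # hypothetical call
    try:
        yield
    finally:
        # Runs even if the read raises, so the sink is never left enabled.
        sca.disable_current_sink(channel)  # hypothetical call


def read_temperature(sca, channel):
    """Read one temperature channel with the sink on only for the duration of the read."""
    with current_sink_enabled(sca, channel):
        return sca.read_adc(channel)  # hypothetical call
```

Putting the disable step in a `finally` block is the point of the design: a failed read cannot leave the current sink enabled for the next register access.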