Block configuration proposal (for discussion)

jsturdy commented 6 years ago

Quick summary

This issue is really cross-repo (depending on V2/V3 electronics, but the implementation will be the same in the end, modulo address differences and register differences)
It is also backwards incompatible, significantly more so than the recent address-table-only changes
A healthy discussion of the pros and cons of the various implementations is needed in order to flesh out the best solution

State of the art

Currently, there are some registers that will make a total block type configuration impossible

A simple implementation (tested for read) would be to do (in uhal syntax), e.g.,

<node id=GEB.VFATS  address=0xwhatever>
<node id=BLOCK  address=0x0  mode="block" size=24*N VFAT rw regs permissions="rw"/>
<node id=VFAT0 address=0x0>
<node id=REG1 address=0x0 permissions="rw"/>
---
<node id=VFATChannels address=0xchanoff permissions="rw">
  <node id=BLOCK  address=0x0  mode="block" size=24*N VFAT rw regs permissions="rw"/>
</node> <!-- ends VFATChannels section -->
</node> <!-- closes VFAT0 node -->
...
</node> <!-- closes GEB.VFATS node -->

One can further move the BLOCK node (or add multiple appropriately named and placed nodes) inside the VFATX node, as the top node, to do block read/write of the settings for just that VFAT
However, with the current organization shown above, such a BLOCK node will fail:
- The GEB.VFATS.VFATX node contains both rw and r type registers
The non-optimal solution is to add BLOCKS of the correct size in each of the rw permission registers, excluding the r only registers
Better would be to move the r only registers to their own block and separate the rw into a different address space, but this requires a redesign of the firmware, which should definitely be done for both V2b and V3
On VFAT2 there are also the "extended registers" and the "extended register pointer", which are how one accesses most of the VFAT registers. They are currently part of the address table and should probably be removed from the user-accessible address space

Proposal for improved implementation

In the "most optimal" way, I would do something like:

<node id=GEB.VFATS  address=0xwhatever>
  <node id=STATUS  address=0x0  description="Contains all readonly VFAT registers">
    <node id=VFAT0 address=0x0>
      <node id=ChipID0 address=0x0 mode="r"/>
---
    </node> <!-- ends VFAT0 readonly node -->
...
    <node id=VFAT23 address=0xvfat23off>
      <node id=ChipID0 address=0x0 mode="r"/>
---
    </node> <!-- ends VFAT23 readonly node -->
  </node>
  <node id=CONFIGURATION  address=0xconfoff>
    <node id=BLOCK  address=0x0  mode="block"  size=24*N VFAT rw regs  permissions="rw"/>
    <node id=VFAT0 address=0x0>
      <node id=BLOCK  address=0x0  mode="block"  size=N VFAT rw regs  permissions="rw"/>
      <node id=REG1 address=0x0 permissions="rw"/>
---
      <node id=VFATChannels address=0xchanoff permissions="rw">
        <node id=BLOCK  address=0x0  mode="block" size=24*N VFAT rw regs permissions="rw"/>
      </node> <!-- ends VFATChannels section -->
---
    </node> <!-- closes VFAT0 node -->
...
    <node id=VFAT23 address=0xvfat23off>
      <node id=BLOCK  address=0x0  mode="block" size=N VFAT rw regs permissions="rw"/>
      <node id=REG1 address=0x0 permissions="rw"/>
---
      <node id=VFATChannels address=0xchanoff permissions="rw">
        <node id=BLOCK  address=0x0  mode="block" size=24*N VFAT rw regs permissions="rw"/>
      </node> <!-- ends VFATChannels section -->
---
    </node> <!-- closes VFAT23 node -->
  </node> <!-- closes CONFIGURATION node -->
</node> <!-- closes GEB.VFATS node -->

Caveats

Done this way, the firmware is still responsible for doing the remote register access correctly (e.g., using the extended register and pointer invisibly to the user). It would, however, be possible to do a block configuration of all VFATs with one transaction. It would also be possible to block-configure all configurable registers on a single VFAT, if desired (for instance when a given VFAT cannot be written to, as I'm not sure what the behaviour is in that case), or even just the VFATChannels node (but only for a given VFAT; moving this to a new place would need to be yet another discussion)

Anecdotal information

I tested a block read of all the VFAT channel registers (block write was failing, possibly due to ipbus implementation on the CTP7 as the transaction reply did not match the IPBus expectation) and it worked splendidly.

bdorney commented 6 years ago

Sorry forgive my ignorance here. I'm not really sure how this block issue plays out. For some register that gets the same value for all chips it seems like a bcast read/write (e.g. MSPL/CFG_PULSE_STRETCH).

But what happens for those rw registers that need to be set per VFAT and cannot be the same value across the link?

Such registers would be:

VThreshold1 (v2b),
CFG_IREF (v3),
Channel registers/masks/trims/etc... (both)

Probably just my lack of knowledge on what blocks are here.

Additionally what impact does this have on sw development? Is this something that we would be doing to stick with ipbus and uhal or does this have implications for rpc and xhal (are we calling xhal this new repo reg-utils now...?)? Basically what is the architecture we are trying to develop for?

mexanick commented 6 years ago

One register is a 32-bit block of memory starting at a certain address. But you can also write a block of several registers in one go. Example: you have the channel registers, 128 registers of 32 bits each, located in memory one after another. Instead of doing 128 read or write transactions of 32 bits each, you can do a single read or write transaction operating on a 128x32-bit data block at the address of the first channel register. Since every VFAT is a hardware-defined unit, it is quite natural to represent it as a single block (possibly with smaller sub-blocks, e.g. for the channel registers; see @jsturdy's example above) and write its configuration as a whole in one transaction. Then, in order to set all the VFATs to a certain configuration, you will need only 24 write transactions instead of over 3k if you write each register individually.
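
The counting argument above can be sketched with a toy model in Python. `FakeBus` and its methods are hypothetical stand-ins that only count transactions; they are not the uhal or rwreg API, and the channel values are placeholders.

```python
class FakeBus:
    """Hypothetical in-memory bus that counts transactions (not a real API)."""

    def __init__(self, size):
        self.mem = [0] * size
        self.transactions = 0

    def write(self, addr, value):
        # One 32-bit word per transaction.
        self.mem[addr] = value & 0xFFFFFFFF
        self.transactions += 1

    def write_block(self, addr, values):
        # Many consecutive 32-bit words in a single transaction.
        self.mem[addr:addr + len(values)] = [v & 0xFFFFFFFF for v in values]
        self.transactions += 1


channel_vals = list(range(128))  # placeholder settings for 128 channel registers

single = FakeBus(128)
for offset, val in enumerate(channel_vals):
    single.write(offset, val)

block = FakeBus(128)
block.write_block(0, channel_vals)

print(single.transactions, block.transactions)
```

Both buses end up with identical memory contents; only the number of transactions differs.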

bdorney commented 6 years ago

I see, so rather than:

for node_name, val in dict_nodeValuePairs.items():
    writeReg(node_name, val, ...)

Which is len(dict_nodeValuePairs) transactions you have:

channelVals = ... # size (cuint * 128)  
writeReg(channel.block, channelVals)

So it seems the difficulty is passed to setting up the channelVals, i.e. the thing to be written, so that it is properly interpreted.

So this increases the speed of configuration/register access because there are far fewer transactions to be performed? When using ipbus I clearly see how this is a considerable advantage. But in the case of rpc this seems like an order epsilon effect, since the bulk of the time taken is the RPC message/response itself...?

mexanick commented 6 years ago

It is more than that, especially when setting the channel settings. The settings themselves have to be stored as a block, otherwise you will have a lot of expensive seek-n-read operations to get the value(s) you want to write. So block transactions look like a better organized way to set things, one that is faster, more reliable and easier to maintain. Also it won't break anything; only a certain rearrangement of the register addresses is required.

bdorney commented 6 years ago

I see because for example in VFAT3 the "real" registers are CFG_CTRL_1 through CFG_CTRL_2 and our human readable registers are applying a mask to the bits sent to those addresses to only change a specific bit?

Same for VFAT2.
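
A minimal sketch of the masked update being described, in Python. The function name, the mask and the values are hypothetical and only illustrate how a named field can change a few bits of a shared 32-bit register; the real masks come from the address table.

```python
def masked_write(current, mask, value):
    """Return the new 32-bit word after updating only the bits selected by mask.

    `mask` must be non-zero; `value` is given relative to the field,
    i.e. it is shifted up to the lowest set bit of the mask.
    """
    shift = (mask & -mask).bit_length() - 1  # position of the lowest set bit
    cleared = current & ~mask & 0xFFFFFFFF   # keep every bit outside the field
    return cleared | ((value << shift) & mask)


# Hypothetical 4-bit field occupying bits 8..11 of a register.
FIELD_MASK = 0x00000F00
new_word = masked_write(0xFFFFFFFF, FIELD_MASK, 0x5)
print(hex(new_word))
```

With individual writes, each named field costs a read-modify-write of this kind; with a block write the full words are assembled once in software and sent as they are.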

jsturdy commented 6 years ago

It's a bit more nuanced than this discussion has fleshed out. Both rwreg (by way of the RPC methods) and ipbus have mechanisms to bundle multiple read/write operations into a single transaction

rwreg with a remote function call which then does individual read/write operations on the AXIbus
ipbus with a single dispatch call for multiple operations (the full details of support for this in the UW ipbus server and the uhal library itself are unclear to me, as I've seen limitations in the number of bundled transactions that are possible before the transaction returns an error, which is determined by the size of the transaction buffer in the server)

I think that, in addition to the speed improvement from single transactions (which I must add was never considered to be our operational goal), there will be stability improvements, due to taking operations out of the software sphere.

However, as @mexanick was saying, you can consider this block read/write operation as a memory dump (in or out):

This is possible because in the address space (CTP7, OptoHybrid, VFAT) all these registers occupy some consecutive block of addresses
In the case of a write, you send data to an endpoint (IPBus firmware core on the GLIB, AXIbus on the CTP7) and it is written sequentially into this block of addresses (and for a read it is similar, just in the opposite direction)

The key thing here is that you're reducing the number of transactions on the AXIbus that are being processed (in the case of the CTP7)

Though, how this block read/write differs in implementation between rwreg and uhal, I can't say (and I've already mentioned some anecdotal evidence that the block write is possibly handicapped in the UW ipbus server, i.e., it's not clear whether the failed block write was due to the server, or due to how the firmware is actually set up)

For the read test that I did (V2b, uhal), I read all 128 channel registers. With python profiling, the 128 individual uhal reads took on average 500ms (TCP overhead I believe, as it's the CTP7; this could be checked with the GLIB, where it goes through the control_hub over UDP), while the block read took about 25ms

The improvement with block transactions has a turn on, so a larger block size will not result in a linear increase in the latency
The stated optimal latency performance is 250us (~1 word) to 500us (>1000 words)
The stated optimal throughput (write) performance is 0.1Gb/s (~1000 words) to 0.5Gb/s (~10000 words; 0.75Gb/s for read)
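
As a quick check of what the measured numbers above imply, here is the arithmetic in Python. Only the 128 reads, 500ms and 25ms figures come from the test described; the variable names are illustrative.

```python
# Measured figures quoted above (V2b, uhal, CTP7).
n_reads = 128
single_total_ms = 500.0   # 128 individual reads
block_total_ms = 25.0     # one block read of the same registers

per_read_ms = single_total_ms / n_reads       # average cost of one single read
speedup = single_total_ms / block_total_ms    # block read vs. individual reads

print("per single read: %.2f ms" % per_read_ms)
print("block speedup:   %.0fx" % speedup)
```

So each individual read costs roughly 4ms on average in this setup, and the block read is about a factor of 20 faster for the same 128 registers.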

At the end of the day, a configuration for a given VFAT will be stored in the DB, and pulled out. It will then be formatted in software into a memory block or collection of individual uint32_ts (I think uhal expects a std::vector<uint32_t> type, and I believe that rwreg is the same) and this object will be pushed into the block write operation (possibly after combination with all other VFATs), sending the data to the endpoint, and letting the firmware do the actual write.
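
A minimal sketch of the "format into a memory block" step, in Python. The function names, the `{offset: value}` input shape and the little-endian byte order are assumptions for illustration; the actual DB format and the byte order expected by the endpoint are not specified in this thread.

```python
import struct


def build_block(reg_values, n_regs):
    """Order {offset: value} pairs into a contiguous list of 32-bit words."""
    missing = set(range(n_regs)) - set(reg_values)
    if missing:
        raise ValueError("no value for offsets: %s" % sorted(missing))
    return [reg_values[off] & 0xFFFFFFFF for off in range(n_regs)]


def to_bytes(words):
    """Pack words as little-endian uint32 (byte order is an assumption)."""
    return struct.pack("<%dI" % len(words), *words)


# Hypothetical configuration for a 4-register block, given out of order.
config = {2: 0x22, 0: 0x04, 3: 0x62, 1: 0x00}
words = build_block(config, 4)
blob = to_bytes(words)
print(words, len(blob))
```

The list of words is what a `std::vector<uint32_t>`-style interface would take, while the packed bytes correspond to a raw memory block.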

mexanick commented 6 years ago

hehe, there're further nuances :)

It will then be formatted in software into a memory block or collection of individual uint32_ts (I think uhal expects a std::vector<uint32_t> type, and I believe that rwreg is the same)

While this is correct for the current implementation, rwreg expects a collection of uint32_ts only for compatibility with ipbus. On the CTP7, AXI transactions are implemented with the memsvc library, which allows reading/writing a memory block of arbitrary size, i.e. we can use

RPCMsg& set_binarydata(std::string key, const void *data, uint32_t bufsize);

and write it in one AXI transaction.

jsturdy commented 6 years ago

Indeed! I was always curious why it wasn't done this way in uhal (though it could have been for simple python compatibility, or maybe it is done this way at the ipbus level)

jsturdy commented 6 years ago

@andrewpeck, @evka85, can you comment on what would be the complications of rearranging the address space thus?

evka85 commented 6 years ago

Hi all,

As I've mentioned before, the CTP7 Zynq and Virtex7 communicate over an AXI-lite protocol, which actually only supports single 32bit transactions, so there wouldn't be any gain at the lowest level (Zynq-Virtex7 communication). It's true that memsvc supports block writes, but under the hood it just writes each 32bit word separately in a loop, just like you would do in your own RPC function if you would use single writes instead of a block write, so it's just an abstraction with no speed benefit. However in uhal case it could reduce the overhead because we would need fewer over-the-network transactions.

In RPC case, I don't think it makes any difference, since you could send a BLOB over the network containing config for all VFATs to a function running in Zynq that would then remap them to individual VFATs. Within a given VFAT, all configuration registers are continuous: VFAT_CHANNELS.CHANNEL0-127 occupy 0x0 - 0x80, and then CFG_0-16 occupy addresses 0x81 through 0x91. So in RPC case, I'm sure there would be no perceivable gain if you just move this remapping to firmware (and save perhaps a few tens of microseconds per chamber).
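
A sketch of the software-side remapping described above, in Python: one blob for all VFATs is cut into per-VFAT slices, which would then be written to each VFAT's own address range. The function name and the VFAT-major ordering of the blob are assumptions; the 24 VFATs and 145 registers per VFAT are the figures used elsewhere in this thread.

```python
def split_per_vfat(blob_words, n_vfats=24, regs_per_vfat=145):
    """Split one flat list of 32-bit words into one list per VFAT.

    Assumes the blob is ordered VFAT by VFAT (all words of VFAT0 first).
    """
    expected = n_vfats * regs_per_vfat
    if len(blob_words) != expected:
        raise ValueError("expected %d words, got %d" % (expected, len(blob_words)))
    return [
        blob_words[i * regs_per_vfat:(i + 1) * regs_per_vfat]
        for i in range(n_vfats)
    ]


blob = list(range(24 * 145))          # placeholder data
per_vfat = split_per_vfat(blob)
print(len(per_vfat), len(per_vfat[0]), per_vfat[1][0])
```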

When using uhal, you can configure one VFAT with a single block transaction, writing to +0x0 through +0x91. But if you wanted to configure all VFATs in a single transaction, yeah then that would require all VFAT rw regs to sit next to each other. Although we ought to check what is the limit on the UHAL transaction size (I'm sure there's a limit, after which it just splits to multiple transactions, which may be abstracted from the user, but I'm pretty sure it's still there). If this limit is really large enough to accommodate enough data for 24 VFATs (13920 bytes), then we would save 24 times the transaction overhead, which is probably on the order of one second or so. If the limit is lower, then we would get diminishing returns depending on the exact size.
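
The dependence on the transaction size limit can be sketched as follows in Python. The limits tried here are hypothetical, since the actual uhal limit is not known in this thread; the 145 registers and 24 VFATs reproduce the 13920 bytes quoted above.

```python
def n_transactions(total_words, max_words_per_transaction):
    """Number of block transactions needed, using integer ceiling division."""
    return -(-total_words // max_words_per_transaction)


REGS_PER_VFAT = 145
N_VFATS = 24
total_words = REGS_PER_VFAT * N_VFATS   # 3480 words
total_bytes = total_words * 4           # 13920 bytes

for limit in (145, 1000, 4096):         # hypothetical limits, in 32-bit words
    print(limit, n_transactions(total_words, limit))
```

A limit of one VFAT's worth of words gives the 24 transactions of the per-VFAT approach, and anything at or above 3480 words gives a single transaction.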

Although I believe that we want to do the VFAT configuration over RPC anyway, so uhal is kindof a moot point here..

The reason why I'm being conservative on this is because implementing it in firmware is not straight forward, and in fact it would even increase the latency of every single VFAT reg write because the firmware would have to do the remapping, which would likely have to be an iterative process because it would not be possible to determine the VFAT number by simple bit-wise operations. I don't even have a good idea how to implement it in a reasonable way...

As it is now, the ctp7 firmware takes some upper bits to identify the OH number and the VFAT number, and then takes the lower 8 bits as a direct address in the VFAT address space and forwards that to the VFAT, which is simple and quick. There are a few "special" VFAT registers that are not continuous with the others in VFAT address space, and I use remapping for those, but this remapping is really simple and is controlled by 2 extra bits in the address (9 and 10). These registers are CFG_RUN, HW_ID, HW_ID_VER, TEST_REG, HW_CHIP_ID.

From my testing, when using wreg in a loop running on Zynq, a single 32bit transaction to a VFAT takes around 15 microseconds or so. For each VFAT we have 145 registers to write, so to configure all VFATs in all 12 OHs connected to one CTP7 would take around: 145 * 24 * 12 * 15 = 626400us, which is just 0.6 of a second! This seems very fast to me, besides we can do this in parallel on all CTP7s. So the main overhead is in the network + software. If we use RPC, then I think we can make it really fast, probably the whole process can finish in a second or two.
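
The estimate above, written out in Python with the factors named. All four numbers are taken from the paragraph; the variable names are illustrative.

```python
regs_per_vfat = 145    # registers to write per VFAT
vfats_per_oh = 24      # VFATs per OptoHybrid
ohs_per_ctp7 = 12      # OptoHybrids connected to one CTP7
us_per_write = 15      # approximate time of one 32-bit write, in microseconds

n_writes = regs_per_vfat * vfats_per_oh * ohs_per_ctp7
total_us = n_writes * us_per_write
total_s = total_us / 1e6

print(n_writes, total_us, total_s)
```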

And last, but not least.... Long term, I think we will probably want to reset the VFATs during hard-resets, which would actually require the CTP7 firmware to configure the VFATs after each hard reset without the software being involved at all. For this, we will have to just write the VFAT configs to CTP7 once after loading the CTP7 firmware, or whenever the configs are changed (which I believe would be a rather rare event). This is a bit of a different topic, but in that case we will either have to store it in Zynq RAM (piggy-backed to the OH FPGA bitstream data for promless), or store it in a CTP7 Virtex7 BRAM (latter case would seem like normal register writes, but in this case it would certainly have them occupy a continuous address space).

Best regards, Evaldas

mexanick commented 6 years ago

Thanks for that nice wrap-up! I do believe we shouldn't really concentrate on uhal-based optimizations. And making all 24 VFATs registers next to each other when operating through CTP7 is really redundant, we can easily do 24 block writes.

jsturdy commented 6 years ago

Yes, thanks, it was my fear that an implementation in firmware would be rather tedious, but it's good that VFAT3 is much more sensibly organized (VFAT2 requires no less than 5 block writes due to a chopped up address space)

I wouldn't discount the "long term" issue, we're in the stage of developing the long term system, and it's important to get that right from the beginning (in my mind even at the expense of having something half-assed in the "right-now")

Now is the time to think about, start implementing, and testing this functionality
We can even set it up such that in the lab setups, we do a software triggered HardReset specifically to load the config into the VFATs, rather than doing any writing (we'll be getting the HardReset during the standard CMS starting of a run anyway, right?)

jsturdy commented 6 years ago

Configuring front ends

Assumptions:

Two write methods exist between Virtex7 and Zynq
- dump: fast AXI/DMA which is utilized for the promless bitstream dumping
- rwreg: traditional "slow" AXI/AXI-lite
(Up to) two OptoHybrid firmware images (long/short) will be stored in some CTP7 RAM at predefined offsets
Configuration register values will be obtained from some configuration database and parsed into a blob ready to be written into the same RAM
- The configuration blob can now be ordered in a few ways:
- OH blobs (one per link) followed by VFAT blobs (one per VFAT) per N links
- OH blob followed by VFAT blobs (one per VFAT) per N links
These blobs will be updated on the CTP7 during an init/cold-init function call in the run control software, triggered/requested whenever the DB has been updated, or when the CTP7 has/needs a firmware reload
- This action is always done by software
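
One possible reading of the two blob orderings listed above, sketched in Python as offset calculations. Everything here is an assumption for illustration: the function names, the interpretation of the orderings (all OH blobs first versus OH and VFAT blobs interleaved per link), and the blob sizes in words, in particular the OH blob size of 100.

```python
N_LINKS = 12
VFATS_PER_LINK = 24
OH_WORDS = 100      # hypothetical OH register blob size, in 32-bit words
VFAT_WORDS = 145    # per-VFAT blob size used elsewhere in this thread


def vfat_offset_grouped(link, vfat):
    """All OH blobs first, then all VFAT blobs ordered by link, then VFAT."""
    return N_LINKS * OH_WORDS + (link * VFATS_PER_LINK + vfat) * VFAT_WORDS


def vfat_offset_interleaved(link, vfat):
    """For each link: its OH blob, followed by that link's VFAT blobs."""
    per_link = OH_WORDS + VFATS_PER_LINK * VFAT_WORDS
    return link * per_link + OH_WORDS + vfat * VFAT_WORDS


print(vfat_offset_grouped(1, 0), vfat_offset_interleaved(1, 0))
```

Either layout works as long as the code that fills the RAM and the code (or firmware) that reads it agree on the same offsets.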

Actual front-end configuration

TTC HardReset:
- OH firmware blobs are loaded via the fast dump method
- OH registers are loaded the same way (following the firmware loading)
- Simultaneously (if decided) VFAT blobs will be pushed to the VFATs
Standard "get run going" sequence
- Generate/request TTC HardReset (or our own specific B-Go) as part of the sequence, preferably during Configure if possible, though from the TCDS side, it may only be possible to have a special sequence for the Start B-Go
- If VFAT registers are reloaded by TTC HardReset we are done otherwise:
- Execute special method to block-write (rwreg) extracted blobs to given VFATs (software) or
- Trigger special firmware function that fast dumps VFAT blobs into the VFATs

Considerations

Time limit for recovery from HardReset: if 0.6s is too long for a HardReset, then reconfiguring the VFATs during a HardReset is a non-starter, and this must be done another way
There should be one thing that actually does this configuration (though it could be triggered by several signals, e.g., HardReset, other B-Go, software function call, special firmware register write...) and here I'm advocating for the fast dump method that reads and shoves the blob (under the assumption that the blob is correct for the given VFATs)

bdorney commented 6 years ago

The following comments are in reference to working with V3 Electronics.

TTC HardReset:
- OH firmware blobs are loaded via the fast dump method
- OH registers are loaded the same way (following the firmware loading)
- Simultaneously (if decided) VFAT blobs will be pushed to the VFATs

I think that whenever a TTC HardReset is issued, a follow-up link reset (GEM_AMC.GEM_SYSTEM.CTRL.LINK_RESET) must be issued afterward to redo the vfat3 sync procedure, but I am not sure. Maybe @evka85 knows whether a TTC HardReset causes a loss of sync of the vfat3's; otherwise a simple test can be performed in the lab.

In fact, it should also be investigated whether a TTC HardReset causes the vfat3's to reset their stored register values to their power-on reset values. I think not, but it should be cross-checked, as this should dictate how we proceed in light of the time requirement mentioned above.

bdorney commented 6 years ago

Yes a TTC HardReset will require all vfat3's be reconfigured either by software or firmware in the present system: http://cmsonline.cern.ch/cms-elog/1039557

mexanick commented 6 years ago

Is it possible to re-sync VFATs without having them drop to power-on settings? If not, why it can't be done (FW setting?). I'm not sure we really need to reload the VFAT settings on each hard reset, and moreover, it would be nice to study how long can we actually keep our settings intact.

bdorney commented 6 years ago

Is it possible to re-sync VFATs without having them drop to power-on settings? If not, why it can't be done (FW setting?). I'm not sure we really need to reload the VFAT settings on each hard reset, and moreover, it would be nice to study how long can we actually keep our settings intact.

A resync is executed when GEM_AMC.GEM_SYSTEM.CTRL.LINK_RESET is performed. This does not cause the VFATs to lose their configuration:

0x65400204 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_PULSE_STRETCH              0x00000004
0x65400204 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SYNC_LEVEL_MODE            0x00000000
0x65400204 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SELF_TRIGGER_MODE          0x00000000
0x65400204 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_DDR_TRIGGER_MODE           0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SPZS_SUMMARY_ONLY          0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SPZS_MAX_PARTITIONS        0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SPZS_ENABLE                0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SZP_ENABLE                 0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SZD_ENABLE                 0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_TIME_TAG                   0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_EC_BYTES                   0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BC_BYTES                   0x00000000
0x6540020c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_FP_FE                      0x00000007
0x6540020c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_RES_PRE                    0x00000002
0x6540020c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAP_PRE                    0x00000001
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_PT                         0x00000003
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_EN_HYST                    0x00000001
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SEL_POL                    0x00000001
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_FORCE_EN_ZCC               0x00000000
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_FORCE_TH                   0x00000000
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SEL_COMP_MODE              0x00000000
0x65400214 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_VREF_ADC                   0x00000003
0x65400214 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_MON_GAIN                   0x00000000
0x65400214 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_MONITOR_SELECT             0x00000000
0x65400218 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_IREF                       0x00000022
0x6540021c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_THR_ZCC_DAC                0x0000000a
0x6540021c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_THR_ARM_DAC                0x00000064
0x65400220 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_HYST                       0x00000005
0x65400224 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_LATENCY                    0x00000062
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_SEL_POL                0x00000001
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_PHI                    0x00000000
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_EXT                    0x00000000
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_DAC                    0x000000c8
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_MODE                   0x00000000
0x6540022c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_FS                     0x00000000
0x6540022c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_DUR                    0x000000c8
0x65400230 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_CFD_DAC_2             0x00000028
0x65400230 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_CFD_DAC_1             0x00000028
0x65400234 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_PRE_I_BSF             0x0000000d
0x65400234 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_PRE_I_BIT             0x00000096
0x65400238 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_PRE_I_BLCC            0x00000019
0x65400238 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_PRE_VREF              0x00000056
0x6540023c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SH_I_BFCAS            0x000000fa
0x6540023c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SH_I_BDIFF            0x00000096
0x65400240 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SH_I_BFAMP            0x00000000
0x65400240 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SD_I_BDIFF            0x000000ff
0x65400244 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SD_I_BSF              0x0000000f
0x65400244 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SD_I_BFCAS            0x000000ff
0x65400c00 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_RUN                        0x00000000
eagle60 > write GEM_AMC.GEM_SYSTEM.CTRL.LINK_RESET 1
Initial value to write: 1, register GEM_AMC.GEM_SYSTEM.CTRL.LINK_RESET
0x00000001(1)   written to GEM_AMC.GEM_SYSTEM.CTRL.LINK_RESET
0x65400204 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_PULSE_STRETCH              0x00000004
0x65400204 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SYNC_LEVEL_MODE            0x00000000
0x65400204 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SELF_TRIGGER_MODE          0x00000000
0x65400204 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_DDR_TRIGGER_MODE           0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SPZS_SUMMARY_ONLY          0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SPZS_MAX_PARTITIONS        0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SPZS_ENABLE                0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SZP_ENABLE                 0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SZD_ENABLE                 0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_TIME_TAG                   0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_EC_BYTES                   0x00000000
0x65400208 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BC_BYTES                   0x00000000
0x6540020c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_FP_FE                      0x00000007
0x6540020c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_RES_PRE                    0x00000002
0x6540020c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAP_PRE                    0x00000001
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_PT                         0x00000003
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_EN_HYST                    0x00000001
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SEL_POL                    0x00000001
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_FORCE_EN_ZCC               0x00000000
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_FORCE_TH                   0x00000000
0x65400210 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_SEL_COMP_MODE              0x00000000
0x65400214 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_VREF_ADC                   0x00000003
0x65400214 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_MON_GAIN                   0x00000000
0x65400214 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_MONITOR_SELECT             0x00000000
0x65400218 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_IREF                       0x00000022
0x6540021c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_THR_ZCC_DAC                0x0000000a
0x6540021c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_THR_ARM_DAC                0x00000064
0x65400220 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_HYST                       0x00000005
0x65400224 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_LATENCY                    0x00000062
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_SEL_POL                0x00000001
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_PHI                    0x00000000
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_EXT                    0x00000000
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_DAC                    0x000000c8
0x65400228 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_MODE                   0x00000000
0x6540022c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_FS                     0x00000000
0x6540022c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_CAL_DUR                    0x000000c8
0x65400230 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_CFD_DAC_2             0x00000028
0x65400230 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_CFD_DAC_1             0x00000028
0x65400234 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_PRE_I_BSF             0x0000000d
0x65400234 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_PRE_I_BIT             0x00000096
0x65400238 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_PRE_I_BLCC            0x00000019
0x65400238 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_PRE_VREF              0x00000056
0x6540023c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SH_I_BFCAS            0x000000fa
0x6540023c rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SH_I_BDIFF            0x00000096
0x65400240 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SH_I_BFAMP            0x00000000
0x65400240 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SD_I_BDIFF            0x000000ff
0x65400244 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SD_I_BSF              0x0000000f
0x65400244 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_BIAS_SD_I_BFCAS            0x000000ff
0x65400c00 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_RUN                        0x00000000

Please note the above register list is the result of this configure command:

% confChamber.py --shelf=2 -s5 -g0 --vfatmask=0x1020

And differs significantly from the power on reset values shown in the above elog link (the link's parent).
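
A before/after comparison like the one above can be automated with a short script. This is a minimal Python sketch: the function names are hypothetical, the input is assumed to be the four-column text format of the dump above, and the three sample lines are copied from it.

```python
def parse_dump(text):
    """Map register name -> value from lines like '0x65400204 rw NAME 0x00000004'."""
    regs = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only register lines; prompts and log messages are skipped.
        if len(parts) == 4 and parts[0].startswith("0x") and parts[3].startswith("0x"):
            regs[parts[2]] = int(parts[3], 16)
    return regs


def changed(before, after):
    """Return {name: (old, new)} for registers whose value differs."""
    return {
        name: (before[name], after.get(name))
        for name in before
        if before[name] != after.get(name)
    }


SAMPLE = """\
0x65400204 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_PULSE_STRETCH              0x00000004
0x65400218 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_IREF                       0x00000022
eagle60 > write GEM_AMC.GEM_SYSTEM.CTRL.LINK_RESET 1
0x65400c00 rw   GEM_AMC.OH.OH0.GEB.VFAT0.CFG_RUN                        0x00000000
"""

before = parse_dump(SAMPLE)
after = parse_dump(SAMPLE)   # identical dump taken after the link reset
print(changed(before, after))
```

An empty result means no register changed, which is what the two dumps above show for LINK_RESET.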

jsturdy commented 6 years ago

In general it would be good to refer to this TTC Scheduling twiki for understanding how subsystems must behave in response to various TTC commands.

Then we should distinguish between TTC commands received and other actions taken on the system which may have overlapping names.

The overlapping names and terminology should be removed and clarified to avoid confusion

jsturdy commented 5 years ago

(from @evka, via email)

Hi guys,

I'm putting together the blaster interface (almost done), but I need to decide on the register access mechanism to the BRAMs (I have 3 separate BRAMs: for GBTXs, for VFATs, and for OHs).

The easiest way would be to just expose a few registers for address and value, so the user would set the address and then write or read the value register:

to write the config:
1) write GEM_AMC.CONFIG_BLASTER.GBTX.ADDRESS 0x00000000
2) write GEM_AMC.CONFIG_BLASTER.GBTX.VALUE 0xdeadbeef
3) write GEM_AMC.CONFIG_BLASTER.GBTX.ADDRESS 0x00000001
4) write GEM_AMC.CONFIG_BLASTER.GBTX.VALUE 0xbaadbabe
....

to readback a given address:
1) write GEM_AMC.CONFIG_BLASTER.GBTX.ADDRESS 0x00000000
2) read GEM_AMC.CONFIG_BLASTER.GBTX.VALUE
3) write GEM_AMC.CONFIG_BLASTER.GBTX.ADDRESS 0x00000001
4) read GEM_AMC.CONFIG_BLASTER.GBTX.VALUE

Another approach would be to just expose a continuous address space for those RAMs. So e.g. GEM_AMC.CONFIG_BLASTER.GBTX.RAM would start at address 0x12345678, and end at 0x1234efff, so the user could then just read or write to/from any of these addresses to interact with the config RAM.

to write:
1) mpoke (0x12345678 + 0x00000000) 0xdeadbeef
2) mpoke (0x12345678 + 0x00000001) 0xbaadbabe

to read:
1) mpeek (0x12345678 + 0x00000000)
2) mpeek (0x12345678 + 0x00000001)

As you can see the first approach requires two writes (address and value) to write a single 32bit value, and to read a single 32bit value you need one write and one read. But the advantage is that this works with our current software. The second approach can be implemented in the firmware easily, but the rw_reg would need to be updated to somehow handle these "block registers" where one register can span several addresses (it does work in uHAL I think, but I don't think we have that in our rw_reg library and certainly not in the reg_interface).

In terms of speed the first approach will obviously take twice as long, but the number of writes/reads isn't really that large (about 30000 x 32-bit values for all 12 OHs), so when using RPC (or any other method that runs on the CTP7) it would only take around 600ms to configure the full system. But doing this through IPbus/uHAL would be much slower than the second method.
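As a rough worked version of this estimate: the sketch below reads the 600 ms figure as applying to the first approach (two accesses per word), which is an assumption about the email rather than something it states outright. The per-access cost and the figure for the second approach follow from that reading and are not measurements.

```python
N_WORDS = 30_000        # 32-bit values for all 12 OHs (from the email)
TOTAL_US = 600_000      # ~600 ms, ASSUMED to refer to the address/value approach

ACCESSES_PER_WORD_INDIRECT = 2   # write ADDRESS, then write VALUE
ACCESSES_PER_WORD_MAPPED = 1     # single write to the mapped address

# Implied cost of one register access under the assumption above.
indirect_accesses = N_WORDS * ACCESSES_PER_WORD_INDIRECT
per_access_us = TOTAL_US // indirect_accesses

# Same per-access cost applied to the memory-mapped approach.
mapped_us = N_WORDS * ACCESSES_PER_WORD_MAPPED * per_access_us
mapped_ms = mapped_us // 1000

print(f"implied cost per access: {per_access_us} us")
print(f"memory-mapped estimate:  {mapped_ms} ms")
```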

I can do it either way in the firmware, but wanted to check with you what your preference is, given all the implications.

Cheers, Evaldas

jsturdy commented 5 years ago

(from @mexanick in reply)

Hi Evaldas,

I would prefer the second way. Moreover, we can read/write a continuous block of 32-bit words. You can do it with memsvc by providing the array of data and its length.

Cheers, -m

jsturdy commented 5 years ago

Yes, I believe what we had discussed is the second method, where we would send the config via a block write transaction.

With the uHAL address table, the block node could be multiply defined (so it should be possible with the GEM_AMC address table too, just with some new parameter defining a dummy node that doesn't generate any FW). If we want some further granularity, we could have, e.g.:

<node id="GEM_AMC">
  <node id="CONFIG_BLASTER" baseaddr="x" size="y" />
  <node id="CONFIG_BLASTER.GBTX" baseaddr="x+off" mode="dummyblock" size="m" />
  <node id="CONFIG_BLASTER.OH" baseaddr="x+off" mode="dummyblock" size="m" />
  <node id="CONFIG_BLASTER.VFATs" baseaddr="x+off" mode="dummyblock" size="m" />
</node>

or even,

<node id="GEM_AMC">
  <node id="CONFIG_BLASTER" baseaddr="x" size="y" />
  <node id="CONFIG_BLASTER.LINKN" baseaddr="x+off" mode="block" size="m" />
  <node id="CONFIG_BLASTER.LINKN.GBTX" baseaddr="x+off" mode="dummyblock" size="m" />
  <node id="CONFIG_BLASTER.LINKN.OH" baseaddr="x+off" mode="dummyblock" size="m" />
  <node id="CONFIG_BLASTER.LINKN.VFATs" baseaddr="x+off" mode="dummyblock" size="m" />
  <node id="CONFIG_BLASTER.LINKN.VFATZZ" baseaddr="x+off" mode="dummyblock" size="m" />
</node>

With this one could simply do (whether uhal or rwreg, as the functions already exist):

writeBlock('GEM_AMC.CONFIG_BLASTER',fullConfigBLOB)

or

writeBlock('GEM_AMC.CONFIG_BLASTER.LINK0',link0ConfigBLOB)

or

writeBlock('GEM_AMC.CONFIG_BLASTER.LINK0.VFATs',link0VFATConfigBLOB)

The exact names might not be possible with subnodes+blocks (I don't remember in uhal, and I don't know for the GEM_AMC generator), but the same structure can be created with a slightly different tree design, i.e., replacing '.' with '_' for anything past 'BLASTER', e.g.,

<node id="GEM_AMC">
  <node id="CONFIG_BLASTER">
    <node id="FULL" baseaddr="x" mode="block" size="y" />
    <node id="LINKN" baseaddr="x+off" mode="block" size="m" />
    <node id="LINKN_GBTX" baseaddr="x+off" mode="dummyblock" size="m" />
    <node id="LINKN_OH" baseaddr="x+off" mode="dummyblock" size="m" />
    <node id="LINKN_VFATs" baseaddr="x+off" mode="dummyblock" size="m" />
    <node id="LINKN_VFATZZ" baseaddr="x+off" mode="dummyblock" size="m" />
  </node>
</node>

All this to say, the configuration RAM should be defined as a block with some predefined size (large enough to contain all the config information we'll be writing).
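To illustrate how multiply defined block nodes could alias one predefined-size configuration RAM, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the section sizes, the back-to-back GBTX/OH/VFATs ordering within a link, and the `build_nodes` and `ConfigRam` names. It is not the GEM_AMC address table generator, and `write_block` here is a mock, not the uhal or rwreg function.

```python
def build_nodes(n_links, gbtx_words, oh_words, vfat_words):
    """Lay out GBTX, OH and VFATs sections back to back for each link.

    Returns (nodes, total_size); nodes maps a node name to (offset, size),
    both in 32-bit words. Sizes and ordering are hypothetical.
    """
    nodes = {}
    offset = 0
    sections = (("GBTX", gbtx_words), ("OH", oh_words), ("VFATs", vfat_words))
    for link in range(n_links):
        link_start = offset
        for suffix, size in sections:
            nodes[f"LINK{link}_{suffix}"] = (offset, size)
            offset += size
        # The per-link node spans all of that link's sections.
        nodes[f"LINK{link}"] = (link_start, offset - link_start)
    # The FULL node spans the whole configuration RAM.
    nodes["FULL"] = (0, offset)
    return nodes, offset


class ConfigRam:
    """Mock configuration RAM addressed through named, overlapping blocks."""

    def __init__(self, nodes, total_size):
        self.nodes = nodes
        self.ram = [0] * total_size

    def write_block(self, node, words):
        offset, size = self.nodes[node]
        if len(words) > size:
            raise ValueError(f"{len(words)} words do not fit in {node} ({size} words)")
        self.ram[offset:offset + len(words)] = [w & 0xFFFFFFFF for w in words]

    def read_block(self, node):
        offset, size = self.nodes[node]
        return self.ram[offset:offset + size]


nodes, total = build_nodes(n_links=2, gbtx_words=4, oh_words=2, vfat_words=6)
ram = ConfigRam(nodes, total)

# Writing through a fine-grained node changes the same words that the
# coarser LINK1 and FULL nodes expose.
ram.write_block("LINK1_OH", [0xDEADBEEF, 0xBAADBABE])
print(nodes["LINK1_OH"], [hex(w) for w in ram.read_block("LINK1")])
```

The size check in `write_block` is the point of giving each block a predefined size: a blob that is too large for its node is rejected instead of silently overwriting the neighbouring section.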

bdorney commented 5 years ago

I have a preference for nodes which include the link number. This in my opinion would be easiest to integrate into the current SW.

evka85 commented 5 years ago

The firmware with blaster RAM support is available here: https://github.com/evka85/GEM_AMC/releases/tag/v3.6.1 There's extensive documentation with examples on the release page.

I'm still working on the firmware to actually stream the data to the hardware, but the software interface is implemented and working. Please give it a spin and let me know if you see any issues or want to change something.

evka85 commented 5 years ago

For now the release is only on my github page, but I'll put it in the central github repo too after testing, and/or implementing the actual config blaster.

jsturdy commented 5 years ago

OK, I think we can start testing this out with the GBT and VFAT config text files, while the BLOB creation code is being developed. Thanks @evka85 for the hard work!
