ChannelFinder / recsync

EPICS Record Synchronizor

INFO tag data not arriving if alias comes before info tag in record def #49

Closed DanielALS closed 3 years ago

DanielALS commented 3 years ago

When a record contains an alias and an info tag in a record definition, the order appears to matter.

We are running ChannelFinder and recsync on many IOCs and have been populating CF with basic information on the records as well as a custom INFO tag we call "archive".

I have been getting this archive information from CF for a few weeks and fixing small bugs in my code as more IOCs get the archive tag. Recently, many records with both an alias and the archive tag have not shown up in CF with data from the tag.

I tried changing a few records in the db file to have the archive tag come first, then the alias. I've added _log.info lines to the recceiver code in recast.py to expose when an alias is found, along with its recname and rid. I also log the info tag data found. Then I can use grep on the logs to match up the rid with the recname, alias, and any info tag data.

The aliases are not consistently showing up with info tag data.

Looking at the code in the recsync client (dbcb.c, pushRecord, line 70), I wonder if the return upon detecting an alias happens too soon?

mdavidsaver commented 3 years ago

Aliases aren't communicated as separate entities. Alias names and infos are sent for each record (in that order).

https://github.com/ChannelFinder/recsync/blob/568fe6625dd877b3c22548bfbbbf39550ab8ca89/client/castApp/src/dbcb.c#L79-L97

mdavidsaver commented 3 years ago

https://github.com/ChannelFinder/recsync/blob/568fe6625dd877b3c22548bfbbbf39550ab8ca89/client/castApp/src/dbcb.c#L90-L97

... The rest of the lines I intended to quote

DanielALS commented 3 years ago

OK, so the non-alias record is checked for an alias, and if it has an alias, the info is copied to the subnet. I still need to find why some aliased records are not getting the info tag data.

DanielALS commented 3 years ago

Is it possible that the info list associated with the prec is not being populated properly? I definitely see a pattern where the placement of the alias and the info tag causes the info to not be found.

mdavidsaver commented 3 years ago

I'm fairly confident that all info() entries make it onto the wire. Can you try running the dbstore plugin alongside cfstore and sqlite3 blah.db .dump the resulting sqlite .db file? If the alias and info()s appear there, then it's clearly a cfstore issue.
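For anyone who would rather inspect the dbstore output from Python than from the sqlite3 shell, here is a minimal sketch using the standard library's `sqlite3.Connection.iterdump()`, which produces the same kind of SQL text as `.dump`. The helper name `dump_sqlite` is made up for illustration, and the path is whatever file the dbstore plugin was configured to write (`blah.db` above is only a placeholder).

```python
import sqlite3


def dump_sqlite(path):
    """Return the full SQL text dump of the SQLite database at `path`.

    Hypothetical helper: roughly equivalent to `sqlite3 <path> .dump`.
    """
    conn = sqlite3.connect(path)
    try:
        # iterdump() yields one SQL statement per item (schema + INSERTs).
        return "\n".join(conn.iterdump())
    finally:
        conn.close()
```

Calling `print(dump_sqlite("blah.db"))` and searching the text for an alias name or an info key is then a quick way to see whether either reached the database.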

DanielALS commented 3 years ago

I will try the SQLITE plugin.

The reason I'm skeptical is that I placed logging in recast.py directly where the TCP messages are decoded, for both message 3 and message 6.

From message 3 (addRec): I get the rid and the record name. From message 6 (recInfo): I get rid and the info data.

Then grepping the recname gives me the rid and grepping the rid gets me any info data that was captured. I see no info data when the alias comes before the info tag.

Will update when the sqlite test is set up.

DanielALS commented 3 years ago

I've set up the SQLite test.

It's hard to say what is happening since the DB doesn't seem to be updating with my test IOC's records. I can see the new records in my recsync logging but not in the sql table record_name.

I don't really want to troubleshoot the dbstore plugin now. If we are checking for data on the wire, then the logs are the low-hanging fruit. Next would be Wireshark, I suppose.

DanielALS commented 3 years ago

I ran tcpdump and I can see that the info data is not on the wire when the alias comes before the info tag. I'm not sure how to provide a minimal working example here.

ecwilliams commented 3 years ago

Database:

    record(ai, "irm:001:ADC0") {
        info(archive,"policy:3min") # sees this
        alias("EG__HTRAM00")
    }
    record(ai, "irm:001:ADC1") {
        alias("EG__TEMP___AM01")
        info(archive,"policy:3min") # doesn't see this
    }

Simple recsync server result:

    incoming client connection from 192.168.2.121
    Greet 0xdeadbeef
    ADDINFO 0 'EPICS_VERSION' 'EPICS 3.14.12.8'
    ADDINFO 0 'HOSTNAME' 'tinkerboard'
    ADDINFO 0 'EPICS_BASE' '/usr/local/epics/base'
    ADDINFO 0 'TOP' '/home/eric/caster'
    ADDINFO 0 'ARCH' 'linux-arm'
    ADDINFO 0 'IOC' 'ioccaster'
    ADDINFO 0 'PWD' '/home/eric/caster/iocBoot/ioccaster'
    ADDINFO 0 'EPICS_HOST_ARCH' 'linux-arm'
    ADDINFO 0 'IOCNAME' 'caster'
    ADDINFO 0 'HOSTNAME' 'tinkerboard'
    ADDINFO 0 'ENGINEER' 'wd6cmu'
    ADDINFO 0 'LOCATION' 'home'
    ADDREC 1 0 'ai' 'irm:001:ADC0'
    ADDREC 1 1 '' 'EG__HTRAM00'
    ADDINFO 1 'archive' 'policy:3min'
    ADDREC 2 0 'ai' 'irm:001:ADC1'
    ADDREC 2 1 '' 'EG__TEMP___AM01'
    Upload done
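Matching record ids by eye gets tedious on a real IOC, so here is a rough sketch that parses server output of the shape shown above and lists aliased records that arrived without a given info key. It assumes, based only on this log, that the second field of ADDREC is 1 for an alias and 0 for the record itself, and that ADDINFO with id 0 carries IOC-wide environment values. The function names `summarize` and `missing_info` are made up for illustration, and the regexes will not cope with values that contain a single quote.

```python
import re
from collections import defaultdict

# Line shapes taken from the simple server output:
#   ADDREC <rid> <alias flag> '<record type>' '<name>'
#   ADDINFO <rid> '<key>' '<value>'
ADDREC = re.compile(r"^ADDREC (\d+) (\d+) '([^']*)' '([^']*)'$")
ADDINFO = re.compile(r"^ADDINFO (\d+) '([^']*)' '([^']*)'$")


def summarize(lines):
    """Group ADDREC/ADDINFO lines by record id; other lines are ignored."""
    records = defaultdict(lambda: {"name": None, "aliases": [], "infos": {}})
    for line in lines:
        line = line.strip()
        m = ADDREC.match(line)
        if m:
            rid, is_alias, _rtype, name = m.groups()
            if is_alias == "1":  # assumption: flag 1 marks an alias name
                records[int(rid)]["aliases"].append(name)
            else:
                records[int(rid)]["name"] = name
            continue
        m = ADDINFO.match(line)
        if m and int(m.group(1)) != 0:  # assumption: rid 0 is IOC-wide info
            rid, key, value = m.groups()
            records[int(rid)]["infos"][key] = value
    return dict(records)


def missing_info(records, key):
    """Names of records that have at least one alias but no `key` info."""
    return [r["name"] for r in records.values()
            if r["aliases"] and key not in r["infos"]]


SAMPLE = """\
ADDREC 1 0 'ai' 'irm:001:ADC0'
ADDREC 1 1 '' 'EG__HTRAM00'
ADDINFO 1 'archive' 'policy:3min'
ADDREC 2 0 'ai' 'irm:001:ADC1'
ADDREC 2 1 '' 'EG__TEMP___AM01'
""".splitlines()

print(missing_info(summarize(SAMPLE), "archive"))  # ['irm:001:ADC1']
```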

ecwilliams commented 3 years ago

I think the info tag is getting associated with the alias record instead of the original, but you're skipping over those in pushRecord(). Not sure how to pick them up while avoiding pushing duplicate records.

ecwilliams commented 3 years ago

"dbCreateAlias assumes that DBENTRY references a particular record instance and creates an alias for that record. If it returns success, then DBENTRY references the alias just created." -- App Developer's Guide

As a result of this, while parsing the database file, subsequent info tags are added to the alias's dbRecordNode's infoList, not that of the canonical record. Since the client skips over alias dbRecordNodes, the info tags are not sent to the server.
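To make the mechanism concrete, here is a toy Python model of it. This is not EPICS code: `Node`, `load`, `push_records`, and the `entry_moves_to_alias` switch are all invented for illustration. `current` plays the role of the DBENTRY, and `push_records` mimics a client that skips alias nodes when collecting infos.

```python
class Node:
    """Toy stand-in for a dbRecordNode: a name, an alias flag, an info list."""

    def __init__(self, name, is_alias=False):
        self.name = name
        self.is_alias = is_alias
        self.infos = {}


def load(statements, entry_moves_to_alias):
    """Replay ("record", name) / ("alias", name) / ("info", key, value)."""
    nodes = []
    current = None  # plays the role of the DBENTRY during parsing
    for stmt in statements:
        if stmt[0] == "record":
            current = Node(stmt[1])
            nodes.append(current)
        elif stmt[0] == "alias":
            alias = Node(stmt[1], is_alias=True)
            nodes.append(alias)
            if entry_moves_to_alias:
                current = alias  # entry now references the alias node
        elif stmt[0] == "info":
            current.infos[stmt[1]] = stmt[2]
    return nodes


def push_records(nodes):
    """Collect infos per record, skipping alias nodes entirely."""
    return {n.name: dict(n.infos) for n in nodes if not n.is_alias}


DB = [
    ("record", "irm:001:ADC1"),
    ("alias", "EG__TEMP___AM01"),
    ("info", "archive", "policy:3min"),
]

print(push_records(load(DB, entry_moves_to_alias=True)))
# {'irm:001:ADC1': {}}
print(push_records(load(DB, entry_moves_to_alias=False)))
# {'irm:001:ADC1': {'archive': 'policy:3min'}}
```

With the entry left on the alias, the info lands on a node the client never visits, which matches the missing ADDINFO for record 2 in the server output.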

mdavidsaver commented 3 years ago

Right, I know that dbCreateAlias() now doesn't do this. But I had forgotten that it did prior to lp:1754298. (fix in 3.15.6, 3.16.2, 7.0.2)

@DanielALS I should have asked earlier. What version(s) of Base are you testing with?
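For sites with a mix of Base versions, a small helper can flag which IOCs are likely affected. This is only a sketch built on the three fix versions listed above; the name `has_alias_info_fix` is made up, it accepts plain dotted numeric strings only, and it assumes that any release series newer than 7.0 also carries the fix and that older series such as 3.14 do not.

```python
# Patch level at which each release series picked up the fix (from the
# versions listed above: 3.15.6, 3.16.2, 7.0.2).
FIXED_IN = {(3, 15): 6, (3, 16): 2, (7, 0): 2}


def has_alias_info_fix(version):
    """Return True if a Base version such as '3.15.4' should have the fix."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    series = (major, minor)
    if series in FIXED_IN:
        return patch >= FIXED_IN[series]
    # Assumption: series newer than 7.0 include the fix, and series
    # older than 3.15 (for example 3.14) never received it.
    return series > max(FIXED_IN)


for v in ("3.14.12.8", "3.15.4", "3.15.6", "7.0.2"):
    print(v, has_alias_info_fix(v))
```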

ecwilliams commented 3 years ago

Oh, heck, I thought I had discovered something new. I had four possible ways to fix it, that was one of them.

mdavidsaver commented 3 years ago

Will anyone from SLAC be joining the upcoming codeathon? Moving function documentation into the header files, and making sure it is correct, is one possible task.

DanielALS commented 3 years ago

> @DanielALS I should have asked earlier. What version(s) of Base are you testing with?

EPICS base = 3.15.4

Sounds like this issue is fixed with 3.15.6+

WRT the codeathon, I would like to join, and a few others at ALS are interested. I just registered.

mdavidsaver commented 3 years ago

> Sounds like this issue is fixed with 3.15.6+

I expect that it is. Please re-open if this isn't the case.