Closed jjaraalm closed 4 years ago
Can you link to, or attach, the server source?
I'm using pvapy
to host them. Here's a script to setup several test PVs. Also, I can get the error with 5 scalars, but it depends on which field I try to write to. For reference, I'm using epics-base
from conda, currently version 7.0.3.1.
import pvaccess as pva
def create_scalar(value_type):
display = pva.PvObject({
'limitLow': pva.DOUBLE,
'limitHigh': pva.DOUBLE,
'description': pva.STRING,
'format': pva.STRING,
'units': pva.STRING
}, 'display_t')
control = pva.PvObject({
'limitLow': pva.DOUBLE,
'limitHigh': pva.DOUBLE,
'minStep': pva.DOUBLE
}, 'control_t')
return pva.PvObject({
'value': value_type,
'timeStamp': pva.PvTimeStamp(),
'display': display,
'control': control
})
def create_scalar_group(n):
scalar = create_scalar(pva.INT)
return pva.PvObject(dict(zip(['x'+str(i+1) for i in range(n)], [scalar]*n)))
if __name__ == "__main__":
server = pva.PvaServer()
for i in range(1, 5):
pvname = 'test' + str(i)
print("Creating PV: " + pvname)
server.addRecord(pvname, create_scalar_group(i))
input('Press Enter to quit.')
The error message appears to originate on the client side.
https://github.com/epics-base/pvAccessCPP/blob/03b01210211a4f74fa435556522381ab48342b69/src/remoteClient/clientContextImpl.cpp#L441 https://github.com/epics-base/pvAccessCPP/blob/03b01210211a4f74fa435556522381ab48342b69/src/remoteClient/clientContextImpl.cpp#L930-L934
Here pvPutBitSet
is the mask of fields being written. m_bitSet
is used to hold the mask received from the previous network op (INIT or GET). So it can quite legitimately be zero. It's not clear to me why this test is needed.
However, if I have 3 or 5 scalars in the outer structure there are no issues. The problem also occurs with 8 and 16 scalars (although I haven't checked all numbers in between).
It is suggestive that your NTScalar (which is missing alarm) has exactly 16 fields. So the overall composite has 1+16*4
or 65 fields, and thus 65 bits in the changed mask BitSet. 64 is a magic number in the BitSet wire encoding.
However, I'm unable to replicate this issue. Which versions of pvDataCPP, pvAccessCPP, and Base are involved?
Can you try making the following change and report what numbers are printed?
diff --git a/src/remoteClient/clientContextImpl.cpp b/src/remoteClient/clientContextImpl.cpp
index 97632395..ac64250b 100644
--- a/src/remoteClient/clientContextImpl.cpp
+++ b/src/remoteClient/clientContextImpl.cpp
@@ -927,6 +927,7 @@ public:
return;
}
+ std::cerr<<"XXX "<<pvPutBitSet->size()<<" < "<<m_bitSet->size()<<std::endl;
if (pvPutBitSet->size() < m_bitSet->size())
{
EXCEPTION_GUARD3(m_callback, cb, cb->putDone(invalidBitSetLengthStatus, thisPtr));
Thanks for looking into this so quickly. The issue started with a more complex structure. I removed alarms and tried to simplify the problematic PV down as much as possible to isolate the issue.
I've been using the binary distribution of 7.0.3.1 through conda
.
I downloaded and build the 7.0.3.1 source from https://epics-controls.org/download/base/base-7.0.3.1.tar.gz and was able to reproduce it. This is supposed to include pvAccess 7.1.0 I believe.
With your logging and the server script above:
Success with 3:
$ ./pvput -q test3 x3.value=1
XXX 64 < 64
Failure with 4:
$ ./pvput -q test4 x4.value=1
XXX 64 < 128
Error: invalid bit-set length
Failure with 5:
$ ./pvput -q test5 x1.value=1
XXX 64 < 128
Error: invalid bit-set length
Success with 5:
$ ./pvput -q test5 x5.value=1
XXX 128 < 128
So if I understand this right, the client is trying to send only a partial structure and doesn't zero-pad the bitset of the partial structure (64) to the full size of the destination structure (128), which triggers the error.
I was able to reproduce this using a simpler structure with 64 int fields. Only the last one can be written to.
So if I understand this right, the client is trying to send only a partial structure and doesn't zero-pad the bitset of the partial structure (64) to the full size of the destination structure (128), which triggers the error.
This is my understanding as well, though it is actually more complicated than this since the same m_bitSet
is updated if the client fetches the current value. pvget does this in order to have any enum choices list to map string to index.
So I think this error check is itself in error. Can you try to remove it?
In addition to manual testing, please also run the automatic tests with make runtests
and report any failures.
diff --git a/src/remoteClient/clientContextImpl.cpp b/src/remoteClient/clientContextImpl.cpp
index 5fd7344a..a4f31bb4 100644
--- a/src/remoteClient/clientContextImpl.cpp
+++ b/src/remoteClient/clientContextImpl.cpp
@@ -925,12 +925,6 @@ public:
return;
}
- if (pvPutBitSet->size() < m_bitSet->size())
- {
- EXCEPTION_GUARD3(m_callback, cb, cb->putDone(invalidBitSetLengthStatus, thisPtr));
- return;
- }
-
if (!startRequest(m_lastRequest.get() ? QOS_DESTROY : QOS_DEFAULT)) {
EXCEPTION_GUARD3(m_callback, cb, cb->putDone(otherRequestPendingStatus, thisPtr));
return;
Okay, is there a way to run tests for pvaccess only? I tried running all tests from base-7.0.3.1
but it bails out on libcom
with:
epicsSockResolveTest.t ........ Bailout called. Further testing stopped: hostToIPAddr() is broken, testing not possible
I'm not sure how to set TOP
correctly in the pvaccess module directory. I tried make runtests TOP=../..
and similar but could not get it to work.
Your ISP is hijacking DNS replies. The wonders of working from home?
A quick way to get past this is to comment out the line TESTS += epicsSockResolveTest
in modules/libcom/test/Makefile
.
Commented that out and got it to run. All tests passed. I also manually tested against the test PVs I had without issue as well as the original ones that were causing problems.
I've removed this check with 4ef9c1102b422c0217cbb07c6d1a574f5baa4d83
Hi,
I've run into a very annoying issue and I have no idea what's wrong. The simplest case has a PV structure like this:
Where each scalar looks like:
Now I try to write to one of them and I get an error
However, if I have 3 or 5 scalars in the outer structure there are no issues. The problem also occurs with 8 and 16 scalars (although I haven't checked all numbers in between).