lsst-uk / lasair-lsst

executeLoad is broken #91

Closed RoyWilliams closed 8 months ago

RoyWilliams commented 8 months ago

The ingestion code calls a function executeLoad to put data into Cassandra. When given a list of 139 ForcedSourceOnDiaObjects, only one is actually uploaded! Here is a test case, run on the ingest node, that exhibits the problem:

ubuntu@lasair-lsst-dev-ingest-0:~$ cat testforce.py 
import sys, json
from cassandra.cluster import Cluster
from gkdbutils.ingesters.cassandra import executeLoad

doid = 1998894547110728537
fsodo = json.loads(open('testforce.json').read())
for fs in fsodo:
    assert(fs['diaObjectId'] == doid)
print(len(fsodo), 'forced sources found all from ', doid)

cluster = Cluster(['lasair-lsst-dev-cassandranodes'])
cassandra_session = cluster.connect()
cassandra_session.set_keyspace('lasair')
executeLoad(cassandra_session, 'ForcedSourceOnDiaObjects', fsodo)

Execute it:

ubuntu@lasair-lsst-dev-ingest-0:~$ python3 testforce.py 
139 forced sources found all from  1998894547110728537

Now query Cassandra, and only one row is there:

cqlsh:lasair> select count(*) from forcedsourceondiaobjects where diaObjectId=1998894547110728537;

 count
-------
     1

RoyWilliams commented 8 months ago

No, actually it's my mistake. Sorry to ruin your Christmas. The problem is that the PRIMARY KEY is set to diaObjectId alone, so of course only one entry goes in: every insert with the same key overwrites the previous row. It should be (diaObjectId, midPointTai), the ID and the time. Sorry!
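For anyone who hits the same symptom, here is a minimal sketch of why the count came out as 1. It does not touch Cassandra or executeLoad; `simulate_upsert` is a hypothetical helper that models the write semantics with a dict keyed on the primary key columns, and the rows are synthetic stand-ins for testforce.json, assuming each forced source has a distinct midPointTai.

```python
def simulate_upsert(rows, key_columns):
    """Model Cassandra write semantics: an INSERT whose primary key
    already exists silently overwrites the existing row."""
    table = {}
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        table[key] = row  # same key -> previous row is replaced
    return table


doid = 1998894547110728537

# Synthetic stand-in for testforce.json: 139 forced sources on one
# object, with made-up but distinct midPointTai values (an assumption).
rows = [{"diaObjectId": doid, "midPointTai": 60000.0 + i} for i in range(139)]

# Schema as it was: PRIMARY KEY (diaObjectId)
old = simulate_upsert(rows, ("diaObjectId",))

# Schema as it should be: PRIMARY KEY (diaObjectId, midPointTai)
new = simulate_upsert(rows, ("diaObjectId", "midPointTai"))

print(len(old), len(new))  # 1 139
```

With the composite key, diaObjectId acts as the partition key and midPointTai as a clustering column, so all forced sources for one object stay in the same partition but remain separate rows.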