JeremyGrosser / tablesnap

Uses inotify to monitor Cassandra SSTables and upload them to S3
BSD 2-Clause "Simplified" License

tableslurp (on master) is throwing an exception...? #96

Open danwashusen opened 6 years ago

danwashusen commented 6 years ago

I'm testing out the latest tableslurp code (I want/need --recursive) but it seems to be failing with an exception that I don't really understand...

I'm invoking with: tableslurp -n backups/cassandra/dp/staging-data-01 --recursive --aws-region us-west-1 my-ec2-shared /var/lib/cassandra/data /var/lib/cassandra/data/

That seems to progress well for a while but then throws an exception and bails:

tableslurp [2018-06-14 05:45:45,565] INFO Thread #1 finished processing
tableslurp [2018-06-14 05:45:45,570] INFO Thread #2 finished processing
tableslurp [2018-06-14 05:45:45,577] INFO Downloading backups/cassandra/dp/staging-data-01:/var/lib/cassandra/data/dp/fi_b/dp-fi_b-ka-415-Summary.db from my-ec2-shared to /var/lib/cassandra/data/dp/fi_b/dp-fi_b-ka-415-Summary.db
tableslurp [2018-06-14 05:45:45,614] INFO My job is done.
Exception in thread Thread-112 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
  File "/usr/lib/python2.7/threading.py", line 763, in run
  File "/usr/local/bin/tableslurp", line 317, in _worker
<type 'exceptions.AttributeError'>: 'NoneType' object has no attribute 'info'

I'm a bit of a Python novice, but if I'm reading that stack trace correctly it maps to the following line in tableslurp (line 317): log.info('Thread #%d finished processing' % (idx,))

That seems to suggest that 'log' is undefined...? However, I see several other log lines that show "Thread #2 finished processing" so I'm super confused...

My Python-novice Google-fu suggests it might be something wrong with threading... Anyone have any ideas?

danwashusen commented 6 years ago

It does seem like all the files were copied over...

My current guess is that it's related to the worker threads having daemon = True set (https://github.com/danwashusen/tablesnap/blob/master/tableslurp#L334). That doesn't seem correct to me...?

danwashusen commented 6 years ago

Confirmed that using daemon = False on the worker threads 'fixes' the issue...
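
For anyone hitting the same traceback, here is a minimal sketch of why daemon = False likely helps. It is not the actual tableslurp code: the logger name, the _STOP sentinel, and the run() helper are all made up for illustration. The idea is that Python does not wait for daemon threads at exit, and on Python 2.7 module globals such as 'log' can already be reset to None while a daemon thread is still running, which would match the "'NoneType' object has no attribute 'info'" message. Non-daemon workers that are explicitly joined finish their last log call before shutdown starts.

```python
import logging
import queue
import threading

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("tableslurp-sketch")  # hypothetical logger name

_STOP = object()  # hypothetical sentinel telling a worker to exit


def _worker(idx, jobs, done):
    # Pull items until the sentinel arrives, then log and return.
    while True:
        item = jobs.get()
        if item is _STOP:
            break
        done.append(item)  # stand-in for downloading one file
    log.info('Thread #%d finished processing', idx)


def run(items, num_threads=4):
    jobs = queue.Queue()
    done = []
    threads = []
    for idx in range(num_threads):
        t = threading.Thread(target=_worker, args=(idx, jobs, done))
        t.daemon = False  # the interpreter waits for this thread at exit
        t.start()
        threads.append(t)
    for item in items:
        jobs.put(item)
    for _ in threads:
        jobs.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()  # every worker has logged before we return
    return done
```

With an explicit join() on every worker, the daemon flag matters much less, since the main thread does not reach interpreter shutdown until all workers have returned.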