wimleers / fileconveyor

File Conveyor is a daemon written in Python to detect, process and sync files. In particular, it's designed to sync files to CDNs. Amazon S3 and Rackspace Cloud Files, as well as any Origin Pull or (S)FTP Push CDN, are supported. Originally written for my bachelor thesis at Hasselt University in Belgium.
https://wimleers.com/fileconveyor
The Unlicense

AlreadyExists exception in PersistentQueue #96

Closed mapreferee closed 10 years ago

mapreferee commented 12 years ago

I know it can work. Any help appreciated please.


root@ip-10-122-87-135:/opt/fileconveyor/code# python arbitrator.py
2011-10-20 15:56:28,110 - Arbitrator - WARNING - File Conveyor is initializing.
2011-10-20 15:56:28,115 - Arbitrator - WARNING - Loaded config file.
2011-10-20 15:56:28,588 - Arbitrator - WARNING - Created 's3' transporter for the 's3' server.
2011-10-20 15:56:28,702 - Arbitrator - WARNING - Created 'cf' transporter for the 'cloudfront' server.
2011-10-20 15:56:28,702 - Arbitrator - WARNING - Server connection tests succesful!
2011-10-20 15:56:28,703 - Arbitrator - WARNING - Setup: created transporter pool for the 's3' server.
2011-10-20 15:56:28,703 - Arbitrator - WARNING - Setup: created transporter pool for the 'cloudfront' server.
2011-10-20 15:56:28,706 - Arbitrator - WARNING - Setup: initialized 'pipeline' persistent queue, contains 1332 items.
2011-10-20 15:56:28,706 - Arbitrator - WARNING - Setup: initialized 'files_in_pipeline' persistent list, contains 0 items.
2011-10-20 15:56:28,707 - Arbitrator - WARNING - Setup: initialized 'failed_files' persistent list, contains 5 items.
2011-10-20 15:56:28,708 - Arbitrator - WARNING - Setup: moved 0 items from the 'files_in_pipeline' persistent list into the 'pipeline' persistent queue.
Exception in thread ArbitratorThread:
Traceback (most recent call last):
  File "/usr/local/lib/python2.5/threading.py", line 486, in bootstrap_inner
    self.run()
  File "arbitrator.py", line 286, in run
    self.setup()
  File "arbitrator.py", line 256, in setup
    self.__allow_retry()
  File "arbitrator.py", line 806, in allow_retry
    self.pipeline_queue.put(item)
  File "/opt/fileconveyor/code/persistent_queue.py", line 106, in put
    raise AlreadyExists
AlreadyExists

^C2011-10-20 15:56:30,971 - Arbitrator - WARNING - Signaling to stop.

P.S. All of this is set up on Amazon EC2.

ykyuen commented 12 years ago

The error can be bypassed by changing this statement (persistent_queue.py, line 72):


self.dbcur.execute("CREATE UNIQUE INDEX IF NOT EXISTS unique_key ON %s (key)" % (self.table))

to a non-unique index:

self.dbcur.execute("CREATE INDEX IF NOT EXISTS unique_key ON %s (key)" % (self.table))


Is it necessary to set UNIQUE for the unique_key index?
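
For anyone weighing this workaround, here is a minimal sketch of how a unique index on `key` rejects a duplicate, written against Python 3's standard `sqlite3` module. The table layout, the table name `pipeline`, and the `put` helper are assumptions made for illustration; they are not copied from persistent_queue.py. The sketch also shows that, in SQLite, re-running the statement without `UNIQUE` leaves an already-created `unique_key` index untouched, which is one possible reason the change has no effect on an existing .db file.

```python
import sqlite3

# In-memory database standing in for the persistent queue's .db file.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
table = "pipeline"  # hypothetical table name, for illustration only

# Assumed schema: the real table layout may differ.
cur.execute("CREATE TABLE %s (id INTEGER PRIMARY KEY, item TEXT, key TEXT)" % table)
cur.execute("CREATE UNIQUE INDEX IF NOT EXISTS unique_key ON %s (key)" % table)

# The edited statement is a no-op here: an index named unique_key already
# exists, so IF NOT EXISTS skips creation and the old index stays unique.
cur.execute("CREATE INDEX IF NOT EXISTS unique_key ON %s (key)" % table)


def put(key, item):
    """Insert an item; return False if the key is already queued."""
    try:
        cur.execute("INSERT INTO %s (item, key) VALUES (?, ?)" % table, (item, key))
        return True
    except sqlite3.IntegrityError:
        # Roughly the situation in which the real code raises AlreadyExists
        # (an assumption based on the traceback, not on the source).
        return False


first = put("css/style.css", "payload")
second = put("css/style.css", "payload")

# PRAGMA index_list rows are (seq, name, unique, ...); unique == 1 means UNIQUE.
index_is_unique = any(
    row[1] == "unique_key" and row[2] == 1
    for row in cur.execute("PRAGMA index_list(%s)" % table)
)
```

With an existing database, the old index would have to be dropped before the non-unique variant could take effect, and that would remove the guarantee that each key is queued only once.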

wimleers commented 12 years ago

I've never, ever seen this problem happening. I have no idea how this could happen.

I can't make this non-unique. Otherwise PersistentQueue.update() could no longer work reliably.

If you can add a test case to persistent_queue_test.py to reproduce this problem, we can fix it, and we'd have a regression test right away.

ykyuen commented 12 years ago

Maybe we could reopen this issue if I come across the same problem again. Thanks, @wimleers.

stongo commented 12 years ago

I was getting almost exactly the same error, and couldn't bypass it as recommended above. I cannot use File Conveyor because this error keeps locking it up.

dustincurrie commented 12 years ago

Ran into this same issue today. I stopped the daemon, removed *.db and started the daemon again. That solved it.
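
A cautious variant of this workaround is to move the .db files into a backup directory instead of deleting them, so they can be inspected or restored later. The sketch below is only an illustration: the function name `set_aside_db_files` and the `db-backup-...` directory name are made up here, and the daemon is assumed to be stopped before it runs.

```python
import glob
import os
import shutil
import time


def set_aside_db_files(directory, suffix=None):
    """Move every *.db file in `directory` into a backup subdirectory.

    Returns the sorted list of file names that were moved.
    Run this only while the daemon is stopped.
    """
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    backup_dir = os.path.join(directory, "db-backup-" + suffix)
    moved = []
    for path in sorted(glob.glob(os.path.join(directory, "*.db"))):
        # Create the backup directory lazily, only if there is something to move.
        os.makedirs(backup_dir, exist_ok=True)
        name = os.path.basename(path)
        shutil.move(path, os.path.join(backup_dir, name))
        moved.append(name)
    return moved
```

Calling `set_aside_db_files(...)` on the directory that holds the .db files would leave the daemon to start from empty databases, which, like deleting them, means it has to rescan and resync from scratch.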

kneedrag commented 12 years ago

I can also confirm the same error. The only way I was able to resolve it was to delete the file that existed on the target and remove the .db files on the File Conveyor side...

kneedrag commented 12 years ago

We continue to experience the same problem...

root@flexiserver:/usr/src/fileconveyor/src/fileconveyo/fileconveyor# python arbitrator.py
/usr/src/fileconveyor/src/fileconveyo/fileconveyor/filter.py:10: DeprecationWarning: the sets module is deprecated
  from sets import Set, ImmutableSet
2012-03-17 10:07:08,072 - Arbitrator - WARNING - File Conveyor is initializing.
2012-03-17 10:07:08,102 - Arbitrator - WARNING - Loaded config file.
/usr/local/lib/python2.6/dist-packages/storages/backends/mosso.py:11: PendingDeprecationWarning: The mosso module will be deprecated in version 1.2 of django-storages. The CloudFiles code has been moved into django-cumulus at http://github.com/richleland/django-cumulus.
  PendingDeprecationWarning)
2012-03-17 09:07:10,000 - Arbitrator - WARNING - Created 'ftp' transporter for the 'sftp-cdn-items' server.
2012-03-17 09:07:10,208 - Arbitrator - WARNING - Created 'ftp' transporter for the 'sftp-ww3-items' server.
2012-03-17 09:07:10,415 - Arbitrator - WARNING - Created 'ftp' transporter for the 'sftp-ww2-items' server.
2012-03-17 09:07:10,627 - Arbitrator - WARNING - Created 'ftp' transporter for the 'sftp-ww3-designs' server.
2012-03-17 09:07:10,865 - Arbitrator - WARNING - Created 'ftp' transporter for the 'sftp-ww2-artists' server.
2012-03-17 09:07:10,866 - Arbitrator - WARNING - Created 'mosso' transporter for the 'cloudfiles' server.
2012-03-17 09:07:11,143 - Arbitrator - WARNING - Created 'ftp' transporter for the 'sftp-cdn-artists' server.
2012-03-17 09:07:11,587 - Arbitrator - WARNING - Created 'ftp' transporter for the 'sftp-cdn-designs' server.
2012-03-17 09:07:11,831 - Arbitrator - WARNING - Created 'ftp' transporter for the 'sftp-ww3-artists' server.
2012-03-17 09:07:12,044 - Arbitrator - WARNING - Created 'ftp' transporter for the 'sftp-ww2-designs' server.
2012-03-17 09:07:12,044 - Arbitrator - WARNING - Server connection tests succesful!
2012-03-17 09:07:12,045 - Arbitrator - WARNING - Setup: created transporter pool for the 'sftp-cdn-items' server.
2012-03-17 09:07:12,045 - Arbitrator - WARNING - Setup: created transporter pool for the 'sftp-ww3-items' server.
2012-03-17 09:07:12,045 - Arbitrator - WARNING - Setup: created transporter pool for the 'sftp-ww2-items' server.
2012-03-17 09:07:12,045 - Arbitrator - WARNING - Setup: created transporter pool for the 'sftp-ww3-designs' server.
2012-03-17 09:07:12,045 - Arbitrator - WARNING - Setup: created transporter pool for the 'sftp-ww2-artists' server.
2012-03-17 09:07:12,045 - Arbitrator - WARNING - Setup: created transporter pool for the 'cloudfiles' server.
2012-03-17 09:07:12,046 - Arbitrator - WARNING - Setup: created transporter pool for the 'sftp-cdn-artists' server.
2012-03-17 09:07:12,046 - Arbitrator - WARNING - Setup: created transporter pool for the 'sftp-cdn-designs' server.
2012-03-17 09:07:12,046 - Arbitrator - WARNING - Setup: created transporter pool for the 'sftp-ww3-artists' server.
2012-03-17 09:07:12,046 - Arbitrator - WARNING - Setup: created transporter pool for the 'sftp-ww2-designs' server.
2012-03-17 09:07:12,055 - Arbitrator - WARNING - Setup: initialized 'pipeline' persistent queue, contains 51438 items.
2012-03-17 09:07:12,055 - Arbitrator - WARNING - Setup: initialized 'files_in_pipeline' persistent list, contains 0 items.
2012-03-17 09:07:12,101 - Arbitrator - WARNING - Setup: initialized 'failed_files' persistent list, contains 2005 items.
2012-03-17 09:07:12,324 - Arbitrator - WARNING - Setup: initialized 'files_to_delete' persistent list, contains 16361 items.
2012-03-17 09:07:12,324 - Arbitrator - WARNING - Setup: moved 0 items from the 'files_in_pipeline' persistent list into the 'pipeline' persistent queue.
Exception in thread ArbitratorThread:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 532, in bootstrap_inner
    self.run()
  File "arbitrator.py", line 273, in run
    self.setup()
  File "arbitrator.py", line 243, in setup
    self.__allow_retry()
  File "arbitrator.py", line 868, in allow_retry
    self.pipeline_queue.put(item)
  File "/usr/src/fileconveyor/src/fileconveyo/fileconveyor/persistent_queue.py", line 106, in put
    raise AlreadyExists
AlreadyExists

chrisivens commented 12 years ago

On mine, it was caused when the arbitrator raised an exception and I terminated the daemon. That left entries in the failed files queue. After a restart, the daemon filled the pipeline queue up again, and then I think it tried to add the failed files on top, causing the error. I emptied both the persistent queue and the failed files one, which cleared it while I tinkered with finding the other exceptions.
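
The scenario described here can be modeled with a toy in-memory queue. This is a sketch under stated assumptions, not File Conveyor's actual code: `ToyQueue` and this `allow_retry` function are hypothetical stand-ins, and the only details taken from the tracebacks above are that `put()` raises `AlreadyExists` and that failed files are re-queued during setup. It shows one way a retry step could tolerate a file that is already queued instead of crashing the thread.

```python
class AlreadyExists(Exception):
    """Raised when a key is queued twice (mirrors the name in the traceback)."""


class ToyQueue:
    """Hypothetical in-memory stand-in for a queue with unique keys."""

    def __init__(self):
        self._items = {}

    def put(self, key, item):
        if key in self._items:
            raise AlreadyExists(key)
        self._items[key] = item

    def __len__(self):
        return len(self._items)


def allow_retry(queue, failed_files):
    """Move failed files back into the queue, skipping ones already queued.

    `failed_files` is a dict of key -> item and is emptied as it is processed.
    Returns (requeued_keys, skipped_keys).
    """
    requeued, skipped = [], []
    for key, item in list(failed_files.items()):
        try:
            queue.put(key, item)
            requeued.append(key)
        except AlreadyExists:
            # The file was rediscovered and queued after the restart,
            # so a second entry is unnecessary.
            skipped.append(key)
        del failed_files[key]
    return requeued, skipped
```

Whether skipping duplicates is the right behavior for the real daemon is a design question for the maintainer; the sketch only illustrates the failure mode and one possible guard.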

adunk commented 11 years ago

For the casual layman getting this error, I discovered I was inadvertently running multiple instances of arbitrator.py on my system. I think I did this by running 'sudo python arbitrator.py &' more than once during the setup process :P To find the processes I used 'ps -A | grep python' and then killed them with 'sudo kill -9 PID'. For good measure I also deleted the fsmonitor.db, persistent_data.db, and synced_files.db files and then restarted fileconveyor. I don't have a large file system so this wasn't too bad, but I can imagine this could be a big pain in the ass for larger websites.
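
One common way to guard against accidentally starting a second instance is an advisory lock on a lock file. The sketch below uses `fcntl.flock` from the Python standard library, so it is Unix-only. The function name `acquire_single_instance_lock` and the lock file path are assumptions for illustration; nothing here is taken from arbitrator.py.

```python
import fcntl
import os


def acquire_single_instance_lock(lock_path):
    """Try to take an exclusive, non-blocking lock on `lock_path`.

    Returns the open file handle on success (keep it open for the lifetime
    of the process), or None if another holder already has the lock.
    """
    handle = open(lock_path, "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Another open handle (typically another process) holds the lock.
        handle.close()
        return None
    # Record our PID in the lock file to help with debugging.
    handle.seek(0)
    handle.truncate()
    handle.write(str(os.getpid()))
    handle.flush()
    return handle
```

A daemon could call this once at startup, with a path such as a hypothetical `/tmp/fileconveyor.lock`, and exit with a clear message when it returns `None`. The lock is released automatically when the process exits or the handle is closed, so a crashed instance does not leave a stale lock behind.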

halburgiss commented 11 years ago

Removing *.db did it for me.

wimleers commented 10 years ago

What @adunk says is the most sensible explanation for me. I guess we should protect users from this small mistake. I opened a new issue for that: #151.

dbswebsite commented 10 years ago

In my case there were no multiple instances of arbitrator.py. I read that thread, but that was not it. Removing all the databases fixed it, though, along with another problem I had that seemed to be on the Rackspace end.