wimleers / fileconveyor

File Conveyor is a daemon written in Python to detect, process and sync files. In particular, it's designed to sync files to CDNs. Amazon S3 and Rackspace Cloud Files, as well as any Origin Pull or (S)FTP Push CDN, are supported. Originally written for my bachelor thesis at Hasselt University in Belgium.
https://wimleers.com/fileconveyor
The Unlicense

File Conveyor/Python issue #90

Closed DotConcepts closed 12 years ago

DotConcepts commented 12 years ago

We followed the CentOS instructions linked from your INSTALL.TXT ( http://drupal.org/node/713190#comment-2594204 ). When we try to start File Conveyor, we get the following error:

/opt/fileconveyor/code/pathscanner.py:31: DeprecationWarning: the sets module is deprecated
  from sets import Set
Using class
Traceback (most recent call last):
  File "fsmonitor.py", line 268, in
    fsmonitor = fsmonitor_class(callbackfunc, True)
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 41, in __init__
    FSMonitor.__init__(self, callback, persistent, trigger_events_for_initial_scan, ignored_dirs, dbfile, parent_logger)
  File "/opt/fileconveyor/code/fsmonitor.py", line 93, in __init__
    self.logger = logging.getLogger(".".join([parent_logger, "FSMonitor"]))
TypeError: sequence item 0: expected string, NoneType found

We are running CentOS v5.5 with Python v2.6.4 and SQLite v3.6.4.

Any assistance you can provide us with would be greatly appreciated.
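
The last frame of the traceback above joins parent_logger with "FSMonitor", and the error message says item 0 of that sequence was None. A minimal sketch of one way to guard against a missing parent name follows. The helper child_logger_name is hypothetical and is not part of File Conveyor; it only illustrates the failure mode and is not necessarily how the project handles it.

```python
import logging


def child_logger_name(parent_logger, name):
    """Build a dotted logger name, skipping a missing (None or empty) parent."""
    # Hypothetical helper: keep only the parts that are actually set.
    parts = [part for part in (parent_logger, name) if part]
    return ".".join(parts)


# With a parent name, the child logger nests under it.
nested = logging.getLogger(child_logger_name("Arbitrator", "FSMonitor"))

# Without a parent name, the join no longer raises a TypeError.
standalone = logging.getLogger(child_logger_name(None, "FSMonitor"))
```

With this guard, a caller that passes no parent logger name simply gets a top-level "FSMonitor" logger.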

wimleers commented 12 years ago

Which version of pyinotify are you using?

Which version of File Conveyor are you using?

DotConcepts commented 12 years ago

Wim,

Thank you for getting back to me. For pyinotify we are running 0.9.2. For File Conveyor, however, all we can find in the way of a version number is D10C55B8.

Also, this is what we are currently getting when trying to run arbitrator.py:

/opt/fileconveyor/code/filter.py:10: DeprecationWarning: the sets module is deprecated
  from sets import Set, ImmutableSet
2011-09-18 16:16:05,092 - Arbitrator - WARNING - File Conveyor is initializing.
2011-09-18 16:16:05,103 - Arbitrator - WARNING - Loaded config file.
2011-09-18 16:16:06,155 - Arbitrator - WARNING - Created 'mosso' transporter for the 'cloudfiles' server.
2011-09-18 16:16:06,155 - Arbitrator - WARNING - Server connection tests succesful!
2011-09-18 16:16:06,163 - Arbitrator - WARNING - Setup: created transporter pool for the 'cloudfiles' server.
2011-09-18 16:16:06,168 - Arbitrator - WARNING - Setup: initialized 'pipeline' persistent queue, contains 0 items.
2011-09-18 16:16:06,170 - Arbitrator - WARNING - Setup: initialized 'files_in_pipeline' persistent list, contains 0 items.
2011-09-18 16:16:06,171 - Arbitrator - WARNING - Setup: initialized 'failed_files' persistent list, contains 0 items.
2011-09-18 16:16:06,172 - Arbitrator - WARNING - Setup: moved 0 items from the 'files_in_pipeline' persistent list into the 'pipeline' persistent queue.
2011-09-18 16:16:06,175 - Arbitrator - WARNING - Setup: connected to the synced files DB. Contains metadata for 0 previously synced files.
2011-09-18 16:16:06,323 - Arbitrator - WARNING - Setup: initialized FSMonitor.
2011-09-18 16:16:06,325 - Arbitrator - WARNING - Fully up and running now.
Exception in thread FSMonitorThread:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 122, in run
    self.process_queues()
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 155, in process_queues
    self.__add_dir(path, event_mask)
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 91, in __add_dir
    FSMonitor.generate_missed_events(self, path)
  File "/opt/fileconveyor/code/fsmonitor.py", line 128, in generate_missed_events
    for event_path, result in self.pathscanner.scan_tree(path):
  File "/opt/fileconveyor/code/pathscanner.py", line 239, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 239, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 239, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 239, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 239, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 239, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 223, in scan_tree
    path = path.decode('utf-8')
  File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 55: ordinal not in range(128)

^C2011-09-18 16:17:09,276 - Arbitrator - WARNING - Signaling to stop.
2011-09-18 16:17:09,304 - Arbitrator - WARNING - Stopping.
2011-09-18 16:17:09,307 - Arbitrator - WARNING - Stopped FSMonitor.
2011-09-18 16:17:09,307 - Arbitrator - WARNING - 'pipeline' persistent queue contains 2369 items.
2011-09-18 16:17:09,307 - Arbitrator - WARNING - 'files_in_pipeline' persistent list contains 0 items.
2011-09-18 16:17:09,308 - Arbitrator - WARNING - 'failed_files' persistent list contains 0 items.
2011-09-18 16:17:09,308 - Arbitrator - WARNING - synced files DB contains metadata for 0 synced files.
2011-09-18 16:17:09,308 - Arbitrator - WARNING - File Conveyor has shut down.

wimleers commented 12 years ago

Is it possible that it's not D10C55B8, but d1c55b89a1? If so, you're running the very last version of File Conveyor. (Also, yes, I know, we desperately need proper version numbers!)

That's a Unicode issue. Could you figure out which file it's choking on, so I can try to reproduce the problem?

wimleers commented 12 years ago

Also, did you upgrade File Conveyor's code from a previous version in this particular instance? If you did, did you run File Conveyor before that code upgrade? File Conveyor may now be storing filenames slightly differently in the database, in which case you would have had to run the included upgrade script. It should work flawlessly (it did for ), but maybe there's still a problem in there somewhere. (This is an unlikely cause of the problem you reported on September 18, but it is a remote possibility.)

DotConcepts commented 12 years ago

Wim,

You are correct on the version number; my fingers apparently got away from me there.

This is a fresh install of FC, not an upgrade. As requested I made sure to update both logging level lines and this is the full contents of /opt/fileconveyor/code/daemon.log:

2011-09-27 02:47:10,176 - Arbitrator - WARNING - File Conveyor is initializing.
2011-09-27 02:47:10,176 - Arbitrator - INFO - Loading config file.
2011-09-27 02:47:10,178 - Arbitrator.Config - INFO - Parsing sources.
2011-09-27 02:47:10,178 - Arbitrator.Config - INFO - Parsing servers.
2011-09-27 02:47:10,179 - Arbitrator.Config - INFO - Parsing rules.
2011-09-27 02:47:10,179 - Arbitrator - WARNING - Loaded config file.
2011-09-27 02:47:10,432 - Arbitrator - WARNING - Created 'mosso' transporter for the 'cloudfiles' server.
2011-09-27 02:47:10,432 - Arbitrator - WARNING - Server connection tests succesful!
2011-09-27 02:47:10,433 - Arbitrator - WARNING - Setup: created transporter pool for the 'cloudfiles' server.
2011-09-27 02:47:10,434 - Arbitrator - INFO - Setup: collected all metadata for rule 'All files related to games' (source: 'drupal').
2011-09-27 02:47:10,435 - Arbitrator - WARNING - Setup: initialized 'pipeline' persistent queue, contains 1 items.
2011-09-27 02:47:10,435 - Arbitrator - WARNING - Setup: initialized 'files_in_pipeline' persistent list, contains 0 items.
2011-09-27 02:47:10,436 - Arbitrator - WARNING - Setup: initialized 'failed_files' persistent list, contains 0 items.
2011-09-27 02:47:10,436 - Arbitrator - WARNING - Setup: moved 0 items from the 'files_in_pipeline' persistent list into the 'pipeline' persistent queue.
2011-09-27 02:47:10,437 - Arbitrator - WARNING - Setup: connected to the synced files DB. Contains metadata for 0 previously synced files.
2011-09-27 02:47:10,573 - Arbitrator.FSMonitor - INFO - FSMonitor class used: FSMonitorInotify.
2011-09-27 02:47:10,573 - Arbitrator - WARNING - Setup: initialized FSMonitor.
2011-09-27 02:47:10,574 - Arbitrator - INFO - Setup: monitoring '/var/www/html/sites/default/files' (drupal).
2011-09-27 02:47:10,575 - Arbitrator - INFO - Cleaned up the working directory '/tmp/daemon'.
2011-09-27 02:47:10,575 - Arbitrator - WARNING - Fully up and running now.
2011-09-27 02:47:10,584 - Arbitrator - INFO - Pipeline queue -> filter queue: '/var/www/html/sites/default/files/new_game.png'.
2011-09-27 02:47:10,612 - Arbitrator - INFO - Filter queue: dropped '/var/www/html/sites/default/files/new_game.png' because it doesn't match any rules.
2011-09-27 02:47:23,582 - Arbitrator.FSMonitor - INFO - Generating missed events for '/var/www/html/sites/default/files' (event mask: None).
2011-09-27 02:48:16,236 - Arbitrator - WARNING - Signaling to stop.
2011-09-27 02:48:16,276 - Arbitrator - WARNING - Stopping.
2011-09-27 02:48:16,279 - Arbitrator - WARNING - Stopped FSMonitor.
2011-09-27 02:48:16,279 - Arbitrator - INFO - Final sync of discover queue to pipeline queue made.
2011-09-27 02:48:16,279 - Arbitrator - WARNING - 'pipeline' persistent queue contains 0 items.
2011-09-27 02:48:16,280 - Arbitrator - WARNING - 'files_in_pipeline' persistent list contains 0 items.
2011-09-27 02:48:16,280 - Arbitrator - WARNING - 'failed_files' persistent list contains 0 items.
2011-09-27 02:48:16,280 - Arbitrator - WARNING - synced files DB contains metadata for 0 synced files.
2011-09-27 02:48:16,280 - Arbitrator - INFO - Cleaned up the working directory '/tmp/daemon'.
2011-09-27 02:48:16,280 - Arbitrator - WARNING - File Conveyor has shut down.

and from the console:

[root@ggj4 code]# python arbitrator.py
/opt/fileconveyor/code/filter.py:10: DeprecationWarning: the sets module is deprecated
  from sets import Set, ImmutableSet
2011-09-27 02:47:10,176 - Arbitrator - WARNING - File Conveyor is initializing.
2011-09-27 02:47:10,176 - Arbitrator - INFO - Loading config file.
2011-09-27 02:47:10,178 - Arbitrator.Config - INFO - Parsing sources.
2011-09-27 02:47:10,178 - Arbitrator.Config - INFO - Parsing servers.
2011-09-27 02:47:10,179 - Arbitrator.Config - INFO - Parsing rules.
2011-09-27 02:47:10,179 - Arbitrator - WARNING - Loaded config file.
2011-09-27 02:47:10,432 - Arbitrator - WARNING - Created 'mosso' transporter for the 'cloudfiles' server.
2011-09-27 02:47:10,432 - Arbitrator - WARNING - Server connection tests succesful!
2011-09-27 02:47:10,433 - Arbitrator - WARNING - Setup: created transporter pool for the 'cloudfiles' server.
2011-09-27 02:47:10,434 - Arbitrator - INFO - Setup: collected all metadata for rule 'All files related to games' (source: 'drupal').
2011-09-27 02:47:10,435 - Arbitrator - WARNING - Setup: initialized 'pipeline' persistent queue, contains 1 items.
2011-09-27 02:47:10,435 - Arbitrator - WARNING - Setup: initialized 'files_in_pipeline' persistent list, contains 0 items.
2011-09-27 02:47:10,436 - Arbitrator - WARNING - Setup: initialized 'failed_files' persistent list, contains 0 items.
2011-09-27 02:47:10,436 - Arbitrator - WARNING - Setup: moved 0 items from the 'files_in_pipeline' persistent list into the 'pipeline' persistent queue.
2011-09-27 02:47:10,437 - Arbitrator - WARNING - Setup: connected to the synced files DB. Contains metadata for 0 previously synced files.
2011-09-27 02:47:10,573 - Arbitrator.FSMonitor - INFO - FSMonitor class used: FSMonitorInotify.
2011-09-27 02:47:10,573 - Arbitrator - WARNING - Setup: initialized FSMonitor.
2011-09-27 02:47:10,574 - Arbitrator - INFO - Setup: monitoring '/var/www/html/sites/default/files' (drupal).
2011-09-27 02:47:10,575 - Arbitrator - INFO - Cleaned up the working directory '/tmp/daemon'.
2011-09-27 02:47:10,575 - Arbitrator - WARNING - Fully up and running now.
2011-09-27 02:47:10,584 - Arbitrator - INFO - Pipeline queue -> filter queue: '/var/www/html/sites/default/files/new_game.png'.
2011-09-27 02:47:10,612 - Arbitrator - INFO - Filter queue: dropped '/var/www/html/sites/default/files/new_game.png' because it doesn't match any rules.
2011-09-27 02:47:23,582 - Arbitrator.FSMonitor - INFO - Generating missed events for '/var/www/html/sites/default/files' (event mask: None).
Exception in thread FSMonitorThread:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 122, in run
    self.process_queues()
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 155, in process_queues
    self.__add_dir(path, event_mask)
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 91, in __add_dir
    FSMonitor.generate_missed_events(self, path)
  File "/opt/fileconveyor/code/fsmonitor.py", line 128, in generate_missed_events
    for event_path, result in self.pathscanner.scan_tree(path):
  File "/opt/fileconveyor/code/pathscanner.py", line 239, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 239, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 223, in scan_tree
    path = path.decode('utf-8')
  File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 55: ordinal not in range(128)
2011-09-27 02:48:16,236 - Arbitrator - WARNING - Signaling to stop.
2011-09-27 02:48:16,276 - Arbitrator - WARNING - Stopping.
2011-09-27 02:48:16,279 - Arbitrator - WARNING - Stopped FSMonitor.
2011-09-27 02:48:16,279 - Arbitrator - INFO - Final sync of discover queue to pipeline queue made.
2011-09-27 02:48:16,279 - Arbitrator - WARNING - 'pipeline' persistent queue contains 0 items.
2011-09-27 02:48:16,280 - Arbitrator - WARNING - 'files_in_pipeline' persistent list contains 0 items.
2011-09-27 02:48:16,280 - Arbitrator - WARNING - 'failed_files' persistent list contains 0 items.
2011-09-27 02:48:16,280 - Arbitrator - WARNING - synced files DB contains metadata for 0 synced files.
2011-09-27 02:48:16,280 - Arbitrator - INFO - Cleaned up the working directory '/tmp/daemon'.
2011-09-27 02:48:16,280 - Arbitrator - WARNING - File Conveyor has shut down.

And, our config.xml (I replaced < and > with [ and ] since I could not figure out how to make it display otherwise):

[?xml version="1.0" encoding="UTF-8"?]
[config]
  [!-- Sources --]
  [sources]
    [source name="drupal" scanPath="/var/www/html/sites/default/files" /]
  [/sources]

  [!-- Servers --]
  [servers]
    [server name="cloudfiles" transporter="mosso"]
      [username]**_[/username]
      [api_key]_**[/api_key]
      [container]gamefiles[/container]
    [/server]
  [/servers]

  [!-- Rules --]
  [rules]
    [rule for="drupal" label="All files related to games"]
      [filter]
        [paths]/uploads[/paths]
      [/filter]
      [processorChain]
        [processor name="filename.SpacesToUnderscores" /]
      [/processorChain]
      [destinations]
        [destination server="cloudfiles" path="static" /]
      [/destinations]
    [/rule]
  [/rules]
[/config]

Unfortunately, this all looks the same to me. I hope you find something in here that is useful.

Thanks.

wimleers commented 12 years ago

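1) You can make XML display properly by wrapping it in a fenced code block: put a line with three backticks followed by xml before it, and a line with three backticks after it.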


2) That being said: yes, if nothing is being done (i.e. the only file being detected doesn't match any rules, hence nothing needs to be done), then DEBUG level logging has nothing to log, hence it doesn't make a difference. Just add a file that does match a rule.


3) The problem you reported in your first post should only be reproducible when calling fsmonitor.py directly, as follows:

python fsmonitor.py

In this case, its __main__ block gets executed, which indeed triggers the problem you first reported. Fixed in 54a154c15b.


4) However, the UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 55: ordinal not in range(128) error, I cannot reproduce. Could you add the following code on line 222 (just before the path.decode() call) in pathscanner.py?

print path

That should allow us to see where this is failing.
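
As a complement to the print statement, the candidate paths can also be listed outside File Conveyor. The sketch below is an assumption-laden illustration, not File Conveyor code: has_non_ascii and find_non_ascii_paths are hypothetical helpers that walk a directory tree and report every path containing a character outside ASCII.

```python
import os


def has_non_ascii(path):
    """Return True if the path contains any character outside ASCII."""
    return any(ord(char) > 127 for char in path)


def find_non_ascii_paths(root):
    """Yield each directory or file path under root that contains a non-ASCII character."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full_path = os.path.join(dirpath, name)
            if has_non_ascii(full_path):
                yield full_path


if __name__ == "__main__":
    # The scanPath from the config in this thread; adjust as needed.
    for found in find_non_ascii_paths(u"/var/www/html/sites/default/files"):
        print(repr(found))
```

Printing the repr of each path keeps the output readable even when the console encoding cannot display the character.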

DotConcepts commented 12 years ago

Wim,

As requested, I added the line to pathscanner.py, and the following output is generated on the console. It would appear, as per your original assessment, that it is choking on a Unicode character in '/var/www/html/sites/default/files/screenshots/Pega Ladrão'.

Thanks, Michael Kinney

[root@ggj4 code]# python arbitrator.py
/opt/fileconveyor/code/filter.py:10: DeprecationWarning: the sets module is deprecated
  from sets import Set, ImmutableSet
2011-09-27 11:28:14,773 - Arbitrator - WARNING - File Conveyor is initializing.
2011-09-27 11:28:14,773 - Arbitrator - INFO - Loading config file.
2011-09-27 11:28:14,775 - Arbitrator.Config - INFO - Parsing sources.
2011-09-27 11:28:14,775 - Arbitrator.Config - INFO - Parsing servers.
2011-09-27 11:28:14,776 - Arbitrator.Config - INFO - Parsing rules.
2011-09-27 11:28:14,776 - Arbitrator - WARNING - Loaded config file.
2011-09-27 11:28:15,014 - Arbitrator - WARNING - Created 'mosso' transporter for the 'cloudfiles' server.
2011-09-27 11:28:15,014 - Arbitrator - WARNING - Server connection tests succesful!
2011-09-27 11:28:15,015 - Arbitrator - WARNING - Setup: created transporter pool for the 'cloudfiles' server.
2011-09-27 11:28:15,015 - Arbitrator - INFO - Setup: collected all metadata for rule 'All files related to games' (source: 'drupal').
2011-09-27 11:28:15,016 - Arbitrator - WARNING - Setup: initialized 'pipeline' persistent queue, contains 0 items.
2011-09-27 11:28:15,017 - Arbitrator - WARNING - Setup: initialized 'files_in_pipeline' persistent list, contains 0 items.
2011-09-27 11:28:15,017 - Arbitrator - WARNING - Setup: initialized 'failed_files' persistent list, contains 0 items.
2011-09-27 11:28:15,018 - Arbitrator - WARNING - Setup: moved 0 items from the 'files_in_pipeline' persistent list into the 'pipeline' persistent queue.
2011-09-27 11:28:15,018 - Arbitrator - WARNING - Setup: connected to the synced files DB. Contains metadata for 0 previously synced files.
2011-09-27 11:28:15,153 - Arbitrator.FSMonitor - INFO - FSMonitor class used: FSMonitorInotify.
2011-09-27 11:28:15,154 - Arbitrator - WARNING - Setup: initialized FSMonitor.
2011-09-27 11:28:15,154 - Arbitrator - INFO - Setup: monitoring '/var/www/html/sites/default/files' (drupal).
2011-09-27 11:28:15,155 - Arbitrator - INFO - Cleaned up the working directory '/tmp/daemon'.
2011-09-27 11:28:15,155 - Arbitrator - WARNING - Fully up and running now.
2011-09-27 11:28:28,632 - Arbitrator.FSMonitor - INFO - Generating missed events for '/var/www/html/sites/default/files' (event mask: None).
/var/www/html/sites/default/files
/var/www/html/sites/default/files/users
/var/www/html/sites/default/files/locationImages
/var/www/html/sites/default/files/f19
/var/www/html/sites/default/files/pictures
/var/www/html/sites/default/files/css
/var/www/html/sites/default/files/countdowntimer
/var/www/html/sites/default/files/screenshots
/var/www/html/sites/default/files/screenshots/RobertoGelati and the Bikers
/var/www/html/sites/default/files/screenshots/Team Synergy Group
/var/www/html/sites/default/files/screenshots/The Left Handed Penelopes
/var/www/html/sites/default/files/screenshots/Team Flash Jelly I
/var/www/html/sites/default/files/screenshots/Likide - 2D
/var/www/html/sites/default/files/screenshots/Nineteen Llamas
/var/www/html/sites/default/files/screenshots/Autobots
/var/www/html/sites/default/files/screenshots/Stinkypants Productions
/var/www/html/sites/default/files/screenshots/Team 4lfa
/var/www/html/sites/default/files/screenshots/C.H.U.M.
/var/www/html/sites/default/files/screenshots/Team Lucid
/var/www/html/sites/default/files/screenshots/Crazy Bomber
/var/www/html/sites/default/files/screenshots/Let's Shooting Love
/var/www/html/sites/default/files/screenshots/Nanosaurus, Inc.
/var/www/html/sites/default/files/screenshots/Estados Unidos Mexicana
/var/www/html/sites/default/files/screenshots/Awesome Walrus
/var/www/html/sites/default/files/screenshots/48 Hours Later
/var/www/html/sites/default/files/screenshots/Alex Angels
/var/www/html/sites/default/files/screenshots/The Confusion Factor
/var/www/html/sites/default/files/screenshots/We Must Protect This House!
/var/www/html/sites/default/files/screenshots/R-O-B-O-T-V-O-I-C-E
/var/www/html/sites/default/files/screenshots/Shipit Productions
/var/www/html/sites/default/files/screenshots/You Are Fired
/var/www/html/sites/default/files/screenshots/Monochrome
/var/www/html/sites/default/files/screenshots/Plead the Fifth
/var/www/html/sites/default/files/screenshots/SafetyScissors2
/var/www/html/sites/default/files/screenshots/Team Pterodactyl
/var/www/html/sites/default/files/screenshots/Crazy Badgers
/var/www/html/sites/default/files/screenshots/TeamPenguin
/var/www/html/sites/default/files/screenshots/Drunkards
/var/www/html/sites/default/files/screenshots/f19
/var/www/html/sites/default/files/screenshots/Equipo1ylla
/var/www/html/sites/default/files/screenshots/team_power
/var/www/html/sites/default/files/screenshots/Jelly Jam
/var/www/html/sites/default/files/screenshots/quarteto_fantastico
/var/www/html/sites/default/files/screenshots/Quicksand
/var/www/html/sites/default/files/screenshots/TippSoc.GGJ.09.3
/var/www/html/sites/default/files/screenshots/Pega Ladrão
Exception in thread FSMonitorThread:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 122, in run
    self.process_queues()
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 155, in process_queues
    self.__add_dir(path, event_mask)
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 91, in __add_dir
    FSMonitor.generate_missed_events(self, path)
  File "/opt/fileconveyor/code/fsmonitor.py", line 128, in generate_missed_events
    for event_path, result in self.pathscanner.scan_tree(path):
  File "/opt/fileconveyor/code/pathscanner.py", line 240, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 240, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 224, in scan_tree
    path = path.decode('utf-8')
  File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 55: ordinal not in range(128)

wimleers commented 12 years ago

It worked just fine for me on OS X: 2011-10-02 15:41:19,530 - Arbitrator - WARNING - Synced: '/Users/wimleers/Desktop/watch/Pega Ladrão.png' (CREATED). The character in position 55 is indeed the à character. Your log reports this to be the u'\xe3' character, but my Python instance says this is the lowercase variant:

python
>>> a = u'\xe3'
>>> print a
ã

However, if I manually try to decode it, I can reproduce your problem:

>>> b = a.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 0: ordinal not in range(128)

Continuing the investigation now.
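
The position in the error message can be checked against the directory reported earlier in this thread. The following is only a sketch, assuming that directory is the path being processed when the error occurs; the variable names are illustrative.

```python
# The directory reported in this thread, with the a-tilde written as an escape.
path = u"/var/www/html/sites/default/files/screenshots/Pega Ladr\xe3o"

# Index of the a-tilde within the path.
position = path.index(u"\xe3")

# Encoding the text as ASCII fails, and the exception records where.
try:
    path.encode("ascii")
    failing_position = None
except UnicodeEncodeError as error:
    failing_position = error.start

print(position, failing_position)
```

Both values come out as 55, which matches the "position 55" in the reported UnicodeEncodeError.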

wimleers commented 12 years ago

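The reason this is failing is that I'm attempting to decode from UTF-8 while the path is already a Unicode string. In Python 2, calling decode() on a Unicode string first implicitly encodes it with the ASCII codec, which is why a decode call ends in a UnicodeEncodeError. There's no point in decoding to Unicode if it's already Unicode, obviously.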

I asked you to add a print path line. Could you now please simply replace the contents of your pathscanner.py file with this one: https://gist.github.com/1258086? That should give me the final clues that I need.
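
One common way to avoid decoding a string that is already Unicode is to decode only byte strings and pass text through unchanged. The sketch below illustrates that idea under the assumption that paths may arrive as either type; to_text is a hypothetical helper, not the code in the gist or in pathscanner.py.

```python
def to_text(path, encoding="utf-8"):
    """Return path as text, decoding it only if it is a byte string."""
    if isinstance(path, bytes):
        # Byte strings (e.g. raw filenames from the OS) need decoding.
        return path.decode(encoding)
    # Already text (unicode in Python 2, str in Python 3): leave untouched.
    return path


# The same directory name as UTF-8 bytes and as text.
from_bytes = to_text(b"Pega Ladr\xc3\xa3o")
from_text = to_text(u"Pega Ladr\xe3o")
```

Both calls return the same text value, so the rest of the scanning code can work with one type regardless of how the path was obtained.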

DotConcepts commented 12 years ago

Wim,

As requested we updated pathscanner.py and got the following output:

': Set([])}
listdir() [u'Screenshot_0.png.thumb.jpg', u'Screenshot_0.png', u'Screenshot.png.thumb.jpg', u'Screenshot_1.png.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Let's Shooting Love <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Let's Shooting Love
listdir() [u'LSL-title.JPG.thumb.jpg', u"Let's Shooting Love MAIN.png", u'LSL-title_1.JPG.thumb.jpg', u"Let's Shooting Love MAIN.png.thumb.jpg", u'LSL-title_0.JPG.thumb.jpg', u'LSL SUPER TITLE.GIF.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'LSL-title.JPG.thumb.jpg', u"Let's Shooting Love MAIN.png", u'LSL-title_1.JPG.thumb.jpg', u"Let's Shooting Love MAIN.png.thumb.jpg", u'LSL-title_0.JPG.thumb.jpg', u'LSL SUPER TITLE.GIF.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Nanosaurus, Inc. <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Nanosaurus, Inc.
__listdir() [u'sumo_screenshot_0.JPG.thumb.jpg', u'sumo_screenshot_1.JPG.thumb.jpg', u'sumo_screenshot.JPG.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'sumo_screenshot_0.JPG.thumb.jpg', u'sumo_screenshot_1.JPG.thumb.jpg', u'sumo_screenshot.JPG.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Estados Unidos Mexicana <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Estados Unidos Mexicana
listdir() [u'BorderTrouble.jpg', u'BorderTrouble.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'BorderTrouble.jpg', u'BorderTrouble.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Awesome Walrus <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Awesome Walrus
listdir() [u'screenshot.JPG.thumb.jpg', u'screenshot.JPG']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'screenshot.JPG.thumb.jpg', u'screenshot.JPG']
scan_tree() /var/www/html/sites/default/files/screenshots/48 Hours Later <type 'unicode'>
/var/www/html/sites/default/files/screenshots/48 Hours Later
listdir() [u'press_0.jpg.thumb.jpg', u'press.jpg.thumb.jpg', u'press_0.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'press_0.jpg.thumb.jpg', u'press.jpg.thumb.jpg', u'press_0.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Alex Angels <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Alex Angels
listdir() [u'press.jpg', u'press.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'press.jpg', u'press.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/The Confusion Factor <type 'unicode'>
/var/www/html/sites/default/files/screenshots/The Confusion Factor
listdir() [u'screenshot_0.jpg.thumb.jpg', u'screenshot_1.jpg.thumb.jpg', u'screenshot_1.jpg', u'screenshot.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'screenshot_0.jpg.thumb.jpg', u'screenshot_1.jpg.thumb.jpg', u'screenshot_1.jpg', u'screenshot.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/We Must Protect This House! <type 'unicode'>
/var/www/html/sites/default/files/screenshots/We Must Protect This House!
listdir() [u'press_0.jpg.thumb.jpg', u'press.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'press_0.jpg.thumb.jpg', u'press.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/R-O-B-O-T-V-O-I-C-E <type 'unicode'>
/var/www/html/sites/default/files/screenshots/R-O-B-O-T-V-O-I-C-E
listdir() [u'screenshot.JPG.thumb.jpg', u'blocking_shadows_0.jpg.thumb.jpg', u'blocking_shadows.jpg.thumb.jpg', u'screenshot_0.JPG', u'blocking_shadows_1.jpg.thumb.jpg', u'screenshot_0.JPG.thumb.jpg', u'blocking_shadows_2.jpg.thumb.jpg', u'blocking_shadows.jpg', u'blocking_shadows_3.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'screenshot.JPG.thumb.jpg', u'blocking_shadows_0.jpg.thumb.jpg', u'blocking_shadows.jpg.thumb.jpg', u'screenshot_0.JPG', u'blocking_shadows_1.jpg.thumb.jpg', u'screenshot_0.JPG.thumb.jpg', u'blocking_shadows_2.jpg.thumb.jpg', u'blocking_shadows.jpg', u'blocking_shadows_3.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Shipit Productions <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Shipit Productions
listdir() [u'LittleBigFortsScreen.png', u'LittleBigFortsScreen.png.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'LittleBigFortsScreen.png', u'LittleBigFortsScreen.png.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/You Are Fired <type 'unicode'>
/var/www/html/sites/default/files/screenshots/You Are Fired
listdir() [u'kids_room.jpg.thumb.jpg', u'kids_room.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'kids_room.jpg.thumb.jpg', u'kids_room.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Monochrome <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Monochrome
listdir() [u'screenshot_0.png.thumb.jpg', u'screenshot.png.thumb.jpg', u'screenshot.png']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'screenshot_0.png.thumb.jpg', u'screenshot.png.thumb.jpg', u'screenshot.png']
scan_tree() /var/www/html/sites/default/files/screenshots/Plead the Fifth <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Plead the Fifth
listdir() [u'mini-add.jpg.thumb.jpg', u'mini-add.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
__listdir() [u'mini-add.jpg.thumb.jpg', u'mini-add.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/SafetyScissors2 <type 'unicode'>
/var/www/html/sites/default/files/screenshots/SafetyScissors2
listdir() [u'Untitled-3.jpg', u'Untitled-3.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'Untitled-3.jpg', u'Untitled-3.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Team Pterodactyl <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Team Pterodactyl
__listdir() [u'Press_0.jpg.thumb.jpg', u'Press.jpg.thumb.jpg', u'Press_1.jpg.thumb.jpg', u'Press_1.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'Press_0.jpg.thumb.jpg', u'Press.jpg.thumb.jpg', u'Press_1.jpg.thumb.jpg', u'Press_1.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Crazy Badgers <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Crazy Badgers
listdir() [u'la_id_0.jpg', u'la_id_0.jpg.thumb.jpg', u'la_id.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'la_id_0.jpg', u'la_id_0.jpg.thumb.jpg', u'la_id.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/TeamPenguin <type 'unicode'>
/var/www/html/sites/default/files/screenshots/TeamPenguin
listdir() [u'Penguin.jpg.thumb.jpg', u'Penguin.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
__listdir() [u'Penguin.jpg.thumb.jpg', u'Penguin.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Drunkards <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Drunkards
listdir() [u'press.jpg', u'press.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'press.jpg', u'press.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/f19 <type 'unicode'>
/var/www/html/sites/default/files/screenshots/f19
listdir() []
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() []
scan_tree() /var/www/html/sites/default/files/screenshots/Equipo1ylla <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Equipo1ylla
listdir() [u'Gnaka-Gnaka.jpg.thumb.jpg', u'Gnaka-Gnaka.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'Gnaka-Gnaka.jpg.thumb.jpg', u'Gnaka-Gnaka.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/team_power <type 'unicode'>
/var/www/html/sites/default/files/screenshots/team_power
listdir() [u'press.jpg', u'press.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'press.jpg', u'press.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/Jelly Jam <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Jelly Jam
listdir() [u'lucid-screenshot.PNG', u'lucid-screenshot.PNG.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'lucid-screenshot.PNG', u'lucid-screenshot.PNG.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/quarteto_fantastico <type 'unicode'>
/var/www/html/sites/default/files/screenshots/quarteto_fantastico
listdir() [u'press.JPG.thumb.jpg', u'press.JPG']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'press.JPG.thumb.jpg', u'press.JPG']
scan_tree() /var/www/html/sites/default/files/screenshots/Quicksand <type 'unicode'>
/var/www/html/sites/default/files/screenshots/Quicksand
listdir() [u'press.jpg', u'press.jpg.thumb.jpg']
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() [u'press.jpg', u'press.jpg.thumb.jpg']
scan_tree() /var/www/html/sites/default/files/screenshots/TippSoc.GGJ.09.3 <type 'unicode'>
/var/www/html/sites/default/files/screenshots/TippSoc.GGJ.09.3
listdir() []
changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])}
listdir() []
scan_tree() /var/www/html/sites/default/files/screenshots/Pega Ladrão <type 'unicode'>
Exception in thread FSMonitorThread:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 122, in run
    self.process_queues()
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 155, in process_queues
    self.__add_dir(path, event_mask)
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 91, in __add_dir
    FSMonitor.generate_missed_events(self, path)
  File "/opt/fileconveyor/code/fsmonitor.py", line 128, in generate_missed_events
    for event_path, result in self.pathscanner.scan_tree(path):
  File "/opt/fileconveyor/code/pathscanner.py", line 243, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 243, in scan_tree
    for subpath, subresult in self.scan_tree(os.path.join(path, filename)):
  File "/opt/fileconveyor/code/pathscanner.py", line 225, in scan_tree
    path = path.decode('utf-8')
  File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 55: ordinal not in range(128)

wimleers commented 12 years ago

Huh!? That makes no sense!

At line 224 in pathscanner.py, there is the following code:

print "scan_tree()", path, type(path)

That should print something like:

scan_tree() /foo/bar/baz <type 'unicode'>

or

scan_tree() /foo/bar/baz <type 'str'>

But the type is not being printed at all! Please verify that you did not change line 224 and that the type(path) call is still there.

DotConcepts commented 12 years ago

I just checked and line 224 is:

print "scan_tree()", path, type(path)

DotConcepts commented 12 years ago

Wim,

I just checked the file on the server and line 224 is exactly the same as in the pathscanner.py file you linked to a few days ago:

print "scan_tree()", path, type(path)

Thank you, Michael Kinney


wimleers commented 12 years ago

Your response is very saddening: it really doesn't make sense that the output does not match the code. Are you sure you're not executing unchanged code and modifying another instance of File Conveyor?

Please change

print "scan_tree()", path, type(path)

to

print "scan_tree()", path, type(path), "blabla"

Hopefully that will help explain why this is happening.

DotConcepts commented 12 years ago

Wim,

As requested I changed the above line (line 224) and I have pasted the output below. As you can see the output appears to be printing what I would have thought was expected:

scan_tree() /var/www/html/sites/default/files/screenshots/Let's Shooting Love <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Let's Shooting Love

Though, it's possible I am missing something.

At this point, would it be more productive to give you access to the development server so that you can see what is going on first hand? I think this would prevent further slowdowns of this type. Let me know and I will send you the necessary credentials.

Thank you again Wim!

Michael Kinney

listdir() [u'Screenshot_0.png.thumb.jpg', u'Screenshot_0.png', u'Screenshot.png.thumb.jpg', u'Screenshot_1.png.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Let's Shooting Love <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Let's Shooting Love listdir() [u'LSL-title.JPG.thumb.jpg', u"Let's Shooting Love MAIN.png", u'LSL-title_1.JPG.thumb.jpg', u"Let's Shooting Love MAIN.png.thumb.jpg", u'LSL-title_0.JPG.thumb.jpg', u'LSL SUPER TITLE.GIF.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'LSL-title.JPG.thumb.jpg', u"Let's Shooting Love MAIN.png", u'LSL-title_1.JPG.thumb.jpg', u"Let's Shooting Love MAIN.png.thumb.jpg", u'LSL-title_0.JPG.thumb.jpg', u'LSL SUPER TITLE.GIF.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Nanosaurus, Inc. <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Nanosaurus, Inc. __listdir() [u'sumo_screenshot_0.JPG.thumb.jpg', u'sumo_screenshot_1.JPG.thumb.jpg', u'sumo_screenshot.JPG.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'sumo_screenshot_0.JPG.thumb.jpg', u'sumo_screenshot_1.JPG.thumb.jpg', u'sumo_screenshot.JPG.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Estados Unidos Mexicana <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Estados Unidos Mexicana listdir() [u'BorderTrouble.jpg', u'BorderTrouble.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'BorderTrouble.jpg', u'BorderTrouble.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Awesome Walrus <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Awesome Walrus listdir() [u'screenshot.JPG.thumb.jpg', u'screenshot.JPG'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() 
[u'screenshot.JPG.thumb.jpg', u'screenshot.JPG'] scan_tree() /var/www/html/sites/default/files/screenshots/48 Hours Later <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/48 Hours Later listdir() [u'press_0.jpg.thumb.jpg', u'press.jpg.thumb.jpg', u'press_0.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'press_0.jpg.thumb.jpg', u'press.jpg.thumb.jpg', u'press_0.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Alex Angels <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Alex Angels listdir() [u'press.jpg', u'press.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'press.jpg', u'press.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/The Confusion Factor <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/The Confusion Factor listdir() [u'screenshot_0.jpg.thumb.jpg', u'screenshot_1.jpg.thumb.jpg', u'screenshot_1.jpg', u'screenshot.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'screenshot_0.jpg.thumb.jpg', u'screenshot_1.jpg.thumb.jpg', u'screenshot_1.jpg', u'screenshot.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/We Must Protect This House! <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/We Must Protect This House! 
listdir() [u'press_0.jpg.thumb.jpg', u'press.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'press_0.jpg.thumb.jpg', u'press.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/R-O-B-O-T-V-O-I-C-E <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/R-O-B-O-T-V-O-I-C-E listdir() [u'screenshot.JPG.thumb.jpg', u'blocking_shadows_0.jpg.thumb.jpg', u'blocking_shadows.jpg.thumb.jpg', u'screenshot_0.JPG', u'blocking_shadows_1.jpg.thumb.jpg', u'screenshot_0.JPG.thumb.jpg', u'blocking_shadows_2.jpg.thumb.jpg', u'blocking_shadows.jpg', u'blocking_shadows_3.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'screenshot.JPG.thumb.jpg', u'blocking_shadows_0.jpg.thumb.jpg', u'blocking_shadows.jpg.thumb.jpg', u'screenshot_0.JPG', u'blocking_shadows_1.jpg.thumb.jpg', u'screenshot_0.JPG.thumb.jpg', u'blocking_shadows_2.jpg.thumb.jpg', u'blocking_shadows.jpg', u'blocking_shadows_3.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Shipit Productions <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Shipit Productions listdir() [u'LittleBigFortsScreen.png', u'LittleBigFortsScreen.png.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'LittleBigFortsScreen.png', u'LittleBigFortsScreen.png.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/You Are Fired <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/You Are Fired listdir() [u'kids_room.jpg.thumb.jpg', u'kids_room.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'kids_room.jpg.thumb.jpg', u'kids_room.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Monochrome <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Monochrome listdir() 
[u'screenshot_0.png.thumb.jpg', u'screenshot.png.thumb.jpg', u'screenshot.png'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'screenshot_0.png.thumb.jpg', u'screenshot.png.thumb.jpg', u'screenshot.png'] scan_tree() /var/www/html/sites/default/files/screenshots/Plead the Fifth <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Plead the Fifth listdir() [u'mini-add.jpg.thumb.jpg', u'mini-add.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} __listdir() [u'mini-add.jpg.thumb.jpg', u'mini-add.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/SafetyScissors2 <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/SafetyScissors2 listdir() [u'Untitled-3.jpg', u'Untitled-3.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'Untitled-3.jpg', u'Untitled-3.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Team Pterodactyl <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Team Pterodactyl __listdir() [u'Press_0.jpg.thumb.jpg', u'Press.jpg.thumb.jpg', u'Press_1.jpg.thumb.jpg', u'Press_1.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'Press_0.jpg.thumb.jpg', u'Press.jpg.thumb.jpg', u'Press_1.jpg.thumb.jpg', u'Press_1.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Crazy Badgers <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Crazy Badgers listdir() [u'la_id_0.jpg', u'la_id_0.jpg.thumb.jpg', u'la_id.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'la_id_0.jpg', u'la_id_0.jpg.thumb.jpg', u'la_id.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/TeamPenguin <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/TeamPenguin listdir() 
[u'Penguin.jpg.thumb.jpg', u'Penguin.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} __listdir() [u'Penguin.jpg.thumb.jpg', u'Penguin.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Drunkards <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Drunkards listdir() [u'press.jpg', u'press.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'press.jpg', u'press.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/f19 <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/f19 listdir() [] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [] scan_tree() /var/www/html/sites/default/files/screenshots/Equipo1ylla <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Equipo1ylla listdir() [u'Gnaka-Gnaka.jpg.thumb.jpg', u'Gnaka-Gnaka.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'Gnaka-Gnaka.jpg.thumb.jpg', u'Gnaka-Gnaka.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/team_power <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/team_power listdir() [u'press.jpg', u'press.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'press.jpg', u'press.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/Jelly Jam <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Jelly Jam listdir() [u'lucid-screenshot.PNG', u'lucid-screenshot.PNG.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'lucid-screenshot.PNG', u'lucid-screenshot.PNG.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/quarteto_fantastico <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/quarteto_fantastico listdir() 
[u'press.JPG.thumb.jpg', u'press.JPG'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'press.JPG.thumb.jpg', u'press.JPG'] scan_tree() /var/www/html/sites/default/files/screenshots/Quicksand <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/Quicksand listdir() [u'press.jpg', u'press.jpg.thumb.jpg'] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [u'press.jpg', u'press.jpg.thumb.jpg'] scan_tree() /var/www/html/sites/default/files/screenshots/TippSoc.GGJ.09.3 <type 'unicode'> blabla /var/www/html/sites/default/files/screenshots/TippSoc.GGJ.09.3 listdir() [] changes at this path: {'deleted': Set([]), 'modified': Set([]), 'created': Set([])} listdir() [] scan_tree() /var/www/html/sites/default/files/screenshots/Pega Ladrão <type 'unicode'> blabla Exception in thread FSMonitorThread: Traceback (most recent call last): File "/usr/lib64/python2.6/threading.py", line 532, in bootstrap_inner self.run() File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 122, in run self.process_queues() File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 155, in process_queues self.__add_dir(path, event_mask) File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 91, in __add_dir FSMonitor.generate_missed_events(self, path) File "/opt/fileconveyor/code/fsmonitor.py", line 128, in generate_missed_events for event_path, result in self.pathscanner.scan_tree(path): File "/opt/fileconveyor/code/pathscanner.py", line 243, in scan_tree for subpath, subresult in self.scan_tree(os.path.join(path, filename)): File "/opt/fileconveyor/code/pathscanner.py", line 243, in scan_tree for subpath, subresult in self.scan_tree(os.path.join(path, filename)): File "/opt/fileconveyor/code/pathscanner.py", line 225, in scan_tree path = path.decode('utf-8') File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode return codecs.utf_8_decode(input, errors, True) UnicodeEncodeError: 
'ascii' codec can't encode character u'\xe3' in position 55: ordinal not in range(128)


wimleers commented 12 years ago

The client of DotConcepts has now contacted me directly to look into the problem and they've given me full server access.


I've found one more bug: the SpacesToUnderscores and SpacesToDashes processors were broken (nobody noticed because typically nobody uses these).


The root problem at hand here turned out to be the following:

>>> a = u"foobar"
>>> a.decode('utf-8')
u'foobar'
>>> b = u"\xe3"
>>> print b
ã
>>> b.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe3' in position 0: ordinal not in range(128)
>>> 

The solution: remove the path.decode() statements from pathscanner.py. Every string passed in there ought to already be a Unicode string anyway. This is a left-over from the initial "Unicode fix": 12f2ddffdf83323f9cdad68760594482b1d34c7f.
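For anyone hitting the same traceback: in Python 2, calling .decode() on a value that is already a Unicode string makes Python first encode it implicitly with the default ASCII codec, which is why the error is a UnicodeEncodeError even though the code asked for a decode. The fix described above simply drops the decode calls. If some callers might still pass byte strings, a guard like the one below is one possible alternative. This is only a sketch: the helper name to_text and the example path are hypothetical and are not part of pathscanner.py.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper, not taken from File Conveyor's pathscanner.py.
# It decodes only byte strings and returns text values untouched, so an
# already-Unicode path never goes through an implicit ASCII encode.

def to_text(path, encoding="utf-8"):
    """Return path as a text string, decoding only if it is a byte string."""
    if isinstance(path, bytes):  # 'bytes' is an alias of 'str' on Python 2.6+
        return path.decode(encoding)
    return path


# Made-up example paths: the same name as UTF-8 bytes and as text.
raw_path = b"/files/screenshots/Pega Ladr\xc3\xa3o"
text_path = u"/files/screenshots/Pega Ladr\xe3o"

assert to_text(raw_path) == text_path   # bytes are decoded
assert to_text(text_path) == text_path  # text passes through unchanged
```

The isinstance check is the whole point: the decode step only runs on values that actually are byte strings.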