Closed rdbeni0 closed 7 years ago
Can you try 2 things for me please?
1. Add server.debug = True before the server.gmail_search() call and report the output here.
2. Wrap proba0 in a list like this: server.gmail_search([proba0], charset='UTF-8'). I see a potential bug when a bare string is passed and that should work around it. If that works I'll make a fix.
OK, I changed the string proba0 into a list and added server.debug = True:
server.debug = True
proba0 = ['from:(noreply@olx.pl) subject:(Wiadomość do ogłoszenia)']
proba1 = server.gmail_search([proba0], charset='UTF-8')
print(proba0.encode(encoding='utf_8'));
print(proba1);
output :
Traceback (most recent call last):
File "/home/collector1871/DEV/python/emailCleaner/emailcleanerfunctions.py", line 46, in gmailcleaner
proba1 = server.gmail_search([proba0], charset='UTF-8')
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 741, in gmail_search
return self._search([b'X-GM-RAW', query], charset)
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 747, in _search
args.extend(_normalise_search_criteria(criteria, charset))
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 1270, in _normalise_search_criteria
return [_handle_one_search_criteria(item, charset) for item in criteria]
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 1270, in <listcomp>
return [_handle_one_search_criteria(item, charset) for item in criteria]
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 1278, in _handle_one_search_criteria
return _maybe_quote(to_bytes(item, charset))
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 1291, in _maybe_quote
out = arg.replace(b'\\', b'\\\\')
AttributeError: 'list' object has no attribute 'replace'
You've wrapped the search string in a list twice! Can you fix and try again?
by the way:
~ % pip show imapclient
Name: IMAPClient
Version: 1.0.2
Summary: Easy-to-use, Pythonic and complete IMAP client library
Home-page: http://imapclient.freshfoo.com/
Author: Menno Smits
Author-email: menno@freshfoo.com
License: http://en.wikipedia.org/wiki/BSD_licenses
Location: /usr/lib/python3.6/site-packages
Requires: six, backports.ssl, mock, pyopenssl
~ %
I am not sure what you mean by "wrapped the search string in a list twice". Now proba0 is a variable holding a UTF-8 string.
Code:
server.debug = True
proba0 = 'from:(noreply@olx.pl) subject:(Wiadomość do ogłoszenia)'
proba1 = server.gmail_search([proba0], charset='UTF-8') #proba0 is inside []
print(proba0.encode(encoding='utf_8'));
print(proba1);
Output:
File "/home/collector1871/DEV/python/emailCleaner/emailcleanerfunctions.py", line 46, in gmailcleaner
proba1 = server.gmail_search([proba0], charset='UTF-8')
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 741, in gmail_search
return self._search([b'X-GM-RAW', query], charset)
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 747, in _search
args.extend(_normalise_search_criteria(criteria, charset))
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 1270, in _normalise_search_criteria
return [_handle_one_search_criteria(item, charset) for item in criteria]
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 1270, in <listcomp>
return [_handle_one_search_criteria(item, charset) for item in criteria]
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 1278, in _handle_one_search_criteria
return _maybe_quote(to_bytes(item, charset))
File "/usr/lib/python3.6/site-packages/imapclient/imapclient.py", line 1291, in _maybe_quote
out = arg.replace(b'\\', b'\\\\')
AttributeError: 'list' object has no attribute 'replace'
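For reference, the traceback hints at why even a single list wrapper fails on this version: gmail_search already puts the query inside its own list ([b'X-GM-RAW', query]), so a list argument ends up nested and reaches the quoting step still as a list. Below is a minimal sketch of that path. The names to_bytes_stub, quote_stub and build_search_args are hypothetical, simplified stand-ins written for illustration; they are not IMAPClient's actual code, and the real helpers may behave differently in other respects.

```python
def to_bytes_stub(value, charset="ascii"):
    # Assumption: text is encoded, anything else is passed through unchanged.
    if isinstance(value, str):
        return value.encode(charset)
    return value


def quote_stub(arg):
    # Mirrors the failing line in the traceback: it expects bytes.
    out = arg.replace(b"\\", b"\\\\")
    return b'"' + out + b'"'


def build_search_args(query, charset="UTF-8"):
    # As in the traceback, the query is placed inside a new list,
    # so passing a list here produces a nested list.
    criteria = [b"X-GM-RAW", query]
    return [quote_stub(to_bytes_stub(item, charset)) for item in criteria]


if __name__ == "__main__":
    # A bare string goes through the stub without error.
    print(build_search_args("subject:(test)"))
    # A list (single or double wrapped) hits the same AttributeError.
    try:
        build_search_args(["subject:(test)"])
    except AttributeError as exc:
        print("AttributeError:", exc)
```

If this reading is right, passing the bare string is the only form that gets as far as the server on 1.0.2, which matches the next test.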
Here is the new output with server.debug = True. Now proba0 is passed as a plain UTF-8 string, not wrapped in a list.
code:
server.debug = True
proba0 = 'from:(noreply@olx.pl) subject:(Wiadomość do ogłoszenia)'
proba1 = server.gmail_search(proba0, charset='UTF-8') #without []
print(proba0.encode(encoding='utf_8'));
print(proba1);
output:
21:00.551931 > OIBM6 UID SEARCH CHARSET UTF-8 X-GM-RAW {60}\r\n
21:00.853282 < b'+ go ahead'
(literal) > "from:(noreply@olx.pl) subject:(Wiadomo\xc5\x9b\xc4\x87 do og\xc5\x82oszenia)"
21:01.168769 < b'* SEARCH'
21:01.169109 < b'OIBM6 OK SEARCH completed (Success)'
b'from:(noreply@olx.pl) subject:(Wiadomo\xc5\x9b\xc4\x87 do og\xc5\x82oszenia)'
[]
I can reproduce the problem by sending myself an email with the same subject that you're testing with.
I can also see the cause. UTF-8 strings get sent as IMAP literals but the search expression has already been quoted before the literal is sent. Gmail sometimes fails to match search expressions sent as literals which have double quotes around them.
I'm not quite sure what the right fix is going to be. I'll have a think about it.
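The debug output above is consistent with that explanation: the announced literal size {60} equals the UTF-8 length of the query plus the two surrounding double quotes. A small sketch to check the arithmetic, assuming the query string from this thread (literal_sizes is a hypothetical helper name, not part of IMAPClient):

```python
QUERY = "from:(noreply@olx.pl) subject:(Wiadomość do ogłoszenia)"


def literal_sizes(query, charset="utf-8"):
    """Return (unquoted, quoted) byte lengths of the encoded query."""
    raw = query.encode(charset)
    # The quoted form is what appears on the "(literal) >" debug line.
    quoted = b'"' + raw + b'"'
    return len(raw), len(quoted)


if __name__ == "__main__":
    unquoted, quoted = literal_sizes(QUERY)
    print(unquoted)  # 58
    print(quoted)    # 60, the size announced as {60} in the debug log
```

So the double quotes are part of the literal payload itself, which fits the description of the expression being quoted before it is sent as a literal.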
python 3.6 :
output (2 lines of 2x print):
When I use pure ASCII (without Polish letters), the search does find something. The search query is correct (checked via the Gmail web interface).
Any idea how I can use a UTF-8 string in a search query?