Open GoogleCodeExporter opened 9 years ago
On further review, I can sort of see what needs to be done, as far as calling
__do_xml_page once for every 10 messages and somehow concatenating the data
from each page.
I definitely can't make enough sense of the code to implement that though.
Original comment by smcgrat...@gmail.com
on 29 Nov 2009 at 6:06
You should be able to set the page attr in the data passed into __do_xml_page,
like so: voice.__do_xml_page('all', {'page':'p3'}). Give that a try; I do not
have enough messages to test pagination, but passing the var as a POST or GET
param should help.
Original comment by justquick
on 29 Nov 2009 at 7:56
I'd rate this as high priority, since after a while, the program will no longer
retrieve new messages, just the same old ones. Any progress on this?
Testing this requires more than 10 conversations, not more than 10 messages. I
have
one conversation with 22 messages, and they all show up in the inbox XML if any
of
them do.
If you have old conversations in your Google Voice trash folder, you can
undelete
them, which puts them back in the inbox and can force the inbox to multiple
pages.
So there's a way to test.
Original comment by na...@animats.com
on 12 Jan 2010 at 6:45
>> You should be able to set the page attr in the data passed into
__do_xml_page. like
>> so: voice.__do_xml_page('all', {'page':'p3'}).
1. There is no "__do_xml_page". There is an "__get_xml_page".
2. The "data" parameter to "__get_xml_page", when set to "{'page': 'p1'}",
results
in an exception in XMLParser, even though there is a valid page p1.
Original comment by na...@animats.com
on 14 Jan 2010 at 7:38
The problem with "2." above is that "__do_page", if given a "data" parameter,
does a POST instead of a GET, sending the "data" info in the request body, not
the URL. Except for page types "DOWNLOAD" and "XML_SEARCH", which are always a
GET.
All that un-commented cutesy stuff with attributes makes this hard to fix with
small
changes. There are too many implicit assumptions about how Google Voice will
behave
nailed into the code. When something needs an extra parameter, but it was
implemented
as an attribute, the design of pygooglevoice breaks down.
Original comment by na...@animats.com
on 14 Jan 2010 at 4:32
Specifying "page" as above generates this URL:
DEBUG:PyGoogleVoice:/voice/inbox/recent/inbox/?_rnr_se=5BilkW7VQpUi5EDSCHmk%2FlbY2mc%3D&page=p2
- {'User-Agent': 'PyGoogleVoice/0.5
}
So the "_rnr_se" parameter is added only if "page" is specified. Google Voice
returns a "403 Forbidden" error in this situation.
Just requesting "https://www.google.com/voice/inbox/recent/inbox/?page=p2" with
Firefox works fine. Unclear why.
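
For reference, here is a minimal sketch of building the plain page URL that
works in Firefox, using only the Python 3 standard library. The helper name
"inbox_page_url" is hypothetical, and this is not how pygooglevoice 0.5 builds
its URLs; it only shows the shape of a GET URL with "page" and no "_rnr_se".

```python
from urllib.parse import urlencode

# Inbox URL quoted in the comment above.
INBOX_URL = "https://www.google.com/voice/inbox/recent/inbox/"


def inbox_page_url(pagenumber=1):
    """Return the inbox URL for a page; '?page=pN' is added only past page 1.

    Hypothetical helper for illustration, not part of pygooglevoice.
    """
    if pagenumber <= 1:
        return INBOX_URL
    return INBOX_URL + "?" + urlencode({"page": "p%d" % pagenumber})


print(inbox_page_url(2))
```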
Original comment by na...@animats.com
on 14 Jan 2010 at 5:20
I have multiple page fetch working now. "__do_page" in "voice.py" needs some
work.
The "debug message" above is misleading: the debug print inserts a "?", but the
actual URL generated doesn't have it. The URL actually sent out looked like
"...inbox/_rnr_se", without the "?". Some other pygooglevoice functions may be
broken because of that. Do "DOWNLOAD" and "XML_SEARCH" work? I suspect not.
Once "__do_page" has been fixed to do a GET with a properly constructed URL in
this situation, we can fetch pages > 1. Code for this looks like
def fetchfolderpage(voice, pagetype, pagenumber=1):
    # Fetch page N (starting from 1) of a folder such as the inbox.
    params = None  # params for fetching page, if any
    if pagenumber > 1:  # if not first page, must put page number in URL
        params = {'page': "p" + str(pagenumber)}  # get page "p2", etc.
    ####print("Page: " + repr(params))  # ***TEMP***
    # Violate class privacy per developer instructions.
    xmlparser = voice._Voice__get_xml_page(pagetype, params)
    return xmlparser  # return XML parser object
This is painful and ugly.
Fetching multiple pages properly requires reading the HTML, and looking for the
"next
page" link to see if there's more to read. I use BeautifulSoup for that,
but pygooglevoice doesn't normally parse the HTML, so that's a problem. If
there's
an "a" tag with an "id" attribute with a value of "gc-inbox-next", there are
more
pages to read. In BeautifulSoup notation:
    moreitem = tree.find("a", attrs={"id": "gc-inbox-next"})
If "moreitem" is not None, there are more pages to be read.
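
If pulling in BeautifulSoup just for this check is undesirable, the same test
can be sketched with the standard library's "html.parser" instead. This is a
swapped-in technique, not what I actually run, and the names "NextLinkFinder"
and "has_next_page" are made up for illustration. It assumes the inbox HTML
still marks the next-page link with id "gc-inbox-next" as described above.

```python
from html.parser import HTMLParser


class NextLinkFinder(HTMLParser):
    """Sets .found to True when an <a id="gc-inbox-next"> tag is seen."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "a" and dict(attrs).get("id") == "gc-inbox-next":
            self.found = True


def has_next_page(html):
    """Return True if the page HTML contains the 'next page' link."""
    finder = NextLinkFinder()
    finder.feed(html)
    return finder.found
```

A caller would fetch page N, run "has_next_page" on that page's HTML, and stop
when it returns False.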
I'll do more cleanup on this. It's definitely fixable, but it doesn't fit well
into
the structure of pygooglevoice.
Original comment by na...@animats.com
on 14 Jan 2010 at 8:01
I think I see how to do this:
1. The API needs some changes. I propose to
give XMLParser an optional "pagenumber" parameter
which is then used in its lambda to get the desired page. So the user
can write "voice.inbox(pagenumber=2)" to get page 2 of the inbox.
Default is 1, this being the Google Voice convention. This maintains
compatibility with existing user code.
There's no good way to detect the last page without looking at the HTML,
other than getting an exception on a bad page number. See my previous
note. So detecting the last page is the caller's responsibility for now.
2. All the "helper" functions in "voice.py" need, instead of one "data"
parameter, a "urldata" and a "postdata" parameter. "urldata" gets urlencoded
and appended to the URL; "postdata" gets sent as part of a POST. If "postdata"
is not None, an HTTP POST will be performed, otherwise a GET.
All the callers of these functions need to be modified. The current hack
in __do_page, "if page in ('DOWNLOAD','XML_SEARCH')", goes away, and the
caller makes the GET/POST decision.
3. Current exception handling in __call__ of XMLParser turns
all exceptions into ParsingError. Because XMLParser's lambda does network
I/O, this hides network errors. You get "Parsing Error" when Google Voice
gave you "403 Forbidden", for example. Exception handling there should pass
through HTTP and OS errors. Then you can tell the difference between "network
problem", "Google changed the API", and "pygooglevoice is broken".
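
A rough sketch of what the "urldata"/"postdata" split in point 2 could look
like, written against the Python 3 standard library rather than the actual
pygooglevoice helpers. The function name "build_request" is hypothetical and
nothing here has been run against Google Voice.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(url, urldata=None, postdata=None, headers=None):
    """Build a GET or POST request; the caller decides which.

    urldata  -- dict urlencoded and appended to the URL
    postdata -- dict sent as the POST body; None means do a GET
    """
    if urldata:
        # Use "?" for the first parameter and "&" for any later ones.
        separator = "&" if "?" in url else "?"
        url = url + separator + urlencode(urldata)
    body = None
    if postdata is not None:
        body = urlencode(postdata).encode("ascii")
    # urllib issues a POST whenever data is not None, otherwise a GET.
    return Request(url, data=body, headers=headers or {})
```

With this shape, fetching inbox page 2 passes urldata={'page': 'p2'} and no
"postdata", so it stays a GET, and the special case for 'DOWNLOAD' and
'XML_SEARCH' is no longer needed.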
The problem is that this requires many small changes all over pygooglevoice,
and I'm
not set up to test it properly other than for SMS. How can we get this done?
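
For point 3, the idea can be sketched independently of the library. The
"ParsingError" class below is a stand-in for pygooglevoice's own exception, and
"fetch_then_parse" is a hypothetical name; the only point is that the network
call sits outside the try block, so HTTP and OS errors reach the caller
unchanged.

```python
class ParsingError(Exception):
    """Stand-in for pygooglevoice's ParsingError (illustration only)."""


def fetch_then_parse(fetch, parse):
    """Run fetch() and then parse(raw), wrapping only parse failures."""
    # Network I/O happens here; errors such as "403 Forbidden" propagate as-is.
    raw = fetch()
    try:
        return parse(raw)
    except Exception as exc:
        # Only genuine parse failures are reported as ParsingError.
        raise ParsingError(str(exc)) from exc
```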
Original comment by na...@animats.com
on 15 Jan 2010 at 5:28
Here's code for a workaround. I do NOT recommend putting this directly into
"pygooglevoice", and it has NOT been tested for non-SMS functions. But I've
successfully received three pages' worth of SMS messages with this code.
Original comment by na...@animats.com
on 15 Jan 2010 at 5:58
Here's a patch override file for a workaround. This doesn't affect the
installed
"pygooglevoice", it just replaces some functions for your application. This
has NOT
been tested for non-SMS functions. This is NOT a permanent fix; I'll leave that
to
the developer. It will read multiple pages of inbox SMS, if you explicitly call
"fetchfolderpage" for each page.
This applies to googlevoice 0.5 only.
Original comment by na...@animats.com
on 15 Jan 2010 at 7:27
Attachments:
[deleted comment]
Here's an alternative patch, in case anyone is interested. I have applied it to
the current pygooglevoice source code. For my own rudimentary testing
scenarios, it seems to work fine.
It allows an optional page-number parameter to be supplied to the
voice.inbox(), voice.starred(), voice.sms(), voice.all(), voice.spam(),
voice.voicemail(), and voice.trash() methods.
For example, after applying this patch, you can now do this to retrieve all of
the SMS conversations on page 17:
voice.sms(17)
Invoking "voice.sms()" will still work as it currently does, and it will
retrieve the conversations on page 1. The same is true for all the other
changed methods.
This applies to pygooglevoice 0.5. For any other version, YMMV.
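
To walk every page rather than one, a loop along these lines might work. It is
only a sketch: "collect_all_pages" and "fetch_page" are hypothetical names, and
it assumes each page can be turned into a dict keyed by conversation id and
that a page past the end comes back empty or repeats old conversations. The
thread does not confirm that behavior (an exception on a bad page number is
also possible), so the caller may need to catch that as well.

```python
def collect_all_pages(fetch_page, max_pages=100):
    """Call fetch_page(1), fetch_page(2), ... and merge the results.

    fetch_page(n) is assumed to return {conversation_id: data} for page n.
    Stops at the first page that adds nothing new, or after max_pages.
    """
    conversations = {}
    for pagenumber in range(1, max_pages + 1):
        page = fetch_page(pagenumber)
        new_items = {key: value for key, value in page.items()
                     if key not in conversations}
        if not new_items:
            break  # empty or fully repeated page: assume we are past the end
        conversations.update(new_items)
    return conversations
```

How "fetch_page" unpacks the result of something like voice.sms(n) depends on
the patched library and is deliberately left out here.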
Original comment by hippo.ma...@gmail.com
on 17 Nov 2010 at 1:22
Attachments:
PS: Here's a script I wrote which makes use of this patched version of
pygooglevoice-0.5. It traverses the entire SMS folder within a given Google
Voice instance and it builds a data structure which contains all SMS messages
within all the conversations on all the pages.
It's very loosely based on the examples/parse_sms.py program.
Like that example, it requires the BeautifulSoup XML parsing library. I used
this version:
http://www.crummy.com/software/BeautifulSoup/download/3.x/BeautifulSoup-3.0.8.1.tar.gz
Original comment by hippo.ma...@gmail.com
on 17 Nov 2010 at 2:15
Attachments:
I've forked this project and incorporated the patch suggested in Comment 12.
http://code.google.com/r/fracai-pygooglevoice/
Original comment by fra...@gmail.com
on 7 Apr 2011 at 2:15
Ah, the abandonware problem.
That's not a great patch, just what I could do from the
outside without redesigning the code.
I finally gave up on Google Voice and switched to Twilio as my
SMS gateway. Twilio isn't free, but it does SMS much better.
A no-traffic poll of Google Voice transmits all that XML for at
least 10 messages, so the overhead and data usage during periods
of light traffic is very high. And you have to jump through hoops
to eliminate duplicate messages.
John Nagle
Original comment by na...@animats.com
on 7 Apr 2011 at 4:11
It's actually not using your patch (from #10), but the one from 12. It looked
pretty clean to me.
Original comment by fra...@gmail.com
on 7 Apr 2011 at 11:05
I have been working with all this code. I have made little to no progress.
Here is what I have:
I receive SMS messages that contain geolocation data. I want to continuously
download the SMS messages into an updated CSV file that can feed a KMZ for
Google Earth. Can anyone help?
XMAN
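
As a starting point for the CSV side only, a minimal sketch: it appends
messages to a CSV file and skips ids already written, so it can be called on
every poll. The field names ("id", "from", "time", "text") and the function
name "append_new_messages" are assumptions for illustration; parsing the
geolocation out of the text and producing the KMZ are not covered.

```python
import csv
import os

# Assumed message keys; adjust to whatever the SMS extraction actually yields.
FIELDS = ["id", "from", "time", "text"]


def append_new_messages(csv_path, messages):
    """Append messages (dicts) not yet in the CSV; return how many were added."""
    file_exists = os.path.exists(csv_path)
    seen = set()
    if file_exists:
        # Collect ids already stored so repeated polls do not duplicate rows.
        with open(csv_path, newline="") as f:
            seen = {row["id"] for row in csv.DictReader(f)}
    added = 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        if not file_exists:
            writer.writeheader()
        for message in messages:
            if message["id"] not in seen:
                writer.writerow(message)
                seen.add(message["id"])
                added += 1
    return added
```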
Original comment by Xavier.J...@gmail.com
on 2 May 2011 at 2:47
I made a branch for gathering all pages from a google voice search query:
http://code.google.com/r/eahutchins-searchpage/
This is useful for dumping full call logs, the search example script now dumps
the date, time and number from each matching record.
Original comment by E.A.Hutc...@gmail.com
on 8 Sep 2011 at 11:39
Original issue reported on code.google.com by
smcgrat...@gmail.com
on 29 Nov 2009 at 5:15