Document is empty - Githubissues

Fluxator commented 7 years ago

Sorry, this is probably just a screw-up on my side. I can access my account via the command line, but when I copy/paste the very first example as a script I get a ParserError: Document is empty

coddingtonbear commented 7 years ago

Hrm -- it's a little hard to say; could you post a full traceback? If I had to guess, though, I'd wonder if maybe you haven't stored a password for the username you've entered.

Fluxator commented 7 years ago

It seems like I don't have this issue on my Linux machine. The one with the problem is a Windows machine. I tried supplying the password both in the Client call and via the keyring, and I get a different error when I leave out both.

Traceback (most recent call last):

  File "<ipython-input-1-915f86be6fed>", line 1, in <module>
    runfile('D:/user/Documents/1 Code/Python/mfp.py', wdir='D:/user/Documents/1 Code/Python')

  File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 685, in runfile
    execfile(filename, namespace)

  File "C:\Python27\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
    exec(compile(scripttext, filename, 'exec'), glob, loc)

  File "D:/user/Documents/1 Code/Python/mfp.py", line 12, in <module>
    client = myfitnesspal.Client('notMyUserName', password='notMyRealPW')

  File "C:\Python27\lib\site-packages\myfitnesspal\client.py", line 51, in __init__
    self._login()

  File "C:\Python27\lib\site-packages\myfitnesspal\client.py", line 79, in _login
    document = self._get_document_for_url(login_url)

  File "C:\Python27\lib\site-packages\myfitnesspal\client.py", line 204, in _get_document_for_url
    return lxml.html.document_fromstring(content)

  File "C:\Python27\lib\site-packages\lxml\html\__init__.py", line 617, in document_fromstring
    "Document is empty")

ParserError: Document is empty

coddingtonbear commented 7 years ago

That's...interesting; can you try running the following from your python prompt:

import requests
print requests.get('https://google.com/').content

Fluxator commented 7 years ago

I tried it on both machines and it worked just fine. The output is quite lengthy, so I did not post/compare, but I guess the important point is that I'm getting something that looks like a google page.

coddingtonbear commented 7 years ago

It looks like the actual URL it's trying to fetch is this one -- https://www.myfitnesspal.com/account/login -- could you try this:

import requests
print requests.get('https://www.myfitnesspal.com/account/login').content

Fluxator commented 7 years ago

Not sure what to look for. I tried it and it gives me the html body of the requested site. Looks fine.
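For anyone unsure what to look for in that output, the following is a minimal sketch of a summary that is easier to compare between machines than the full page body. The helper name summarize_response and the "looks like HTML" check are assumptions made for illustration; they are not part of requests or python-myfitnesspal. It only reads the status_code, content, and headers attributes that a requests response provides.

```python
def summarize_response(response):
    """Collect a few facts worth comparing between a working and a failing fetch.

    `response` is expected to look like a requests.Response: it needs
    `status_code`, `content` (the raw body), and `headers` (a mapping).
    """
    body = response.content
    return {
        "status": response.status_code,
        "bytes": len(body),  # 0 here would explain "Document is empty"
        "content_type": response.headers.get("Content-Type"),
        # Rough heuristic only: look for an <html tag near the start of the body.
        "looks_like_html": b"<html" in body[:2048].lower(),
    }


# Usage (needs network access and the requests package):
# import requests
# print(summarize_response(requests.get('https://www.myfitnesspal.com/account/login')))
```

If the byte count is non-zero and the body looks like HTML on both machines, the fetch itself is probably not what differs.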

coddingtonbear commented 7 years ago

I'm a little mystified; the error you're getting from python-myfitnesspal indicates that it received no data when making a similar request; is there anything different between the environment you're running these tests in for me and the environment python-myfitnesspal is running in?

Fluxator commented 7 years ago

I'm doing this in Spyder in a Windows 7 environment. I tried both the console and a separate script for the snippet, just to be sure. They both seem to work.

coddingtonbear commented 7 years ago

I'm not sure exactly how to help you directly with this at this point, but I can give you a tip you can use for investigating this problem on your own. If you aren't familiar with ipdb or pdb, you may have to do a tiny bit of research, but it's really not very complicated. What I'd recommend is dropping an ipdb or pdb trace around this line to try stepping through the calls to see what might be amiss. The request you tested above should be identical to the request that is generated by what this line calls. Let me know if you find anything interesting, @Fluxator!
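As a rough illustration of that kind of investigation, here is a minimal sketch that wraps a parse call, prints what was passed in when it fails, and optionally opens the standard pdb post-mortem debugger at the failing frame. The wrapper name parse_with_debug and its debug flag are hypothetical and are not part of python-myfitnesspal; only pdb.post_mortem and sys.exc_info come from the standard library.

```python
import pdb
import sys


def parse_with_debug(parse, content, debug=False):
    """Call parse(content); on failure, report the input and optionally debug.

    `parse` is any callable, for example lxml.html.document_fromstring.
    """
    try:
        return parse(content)
    except Exception:
        # Show what the parser actually received before the error propagates.
        print("parse failed: type=%s length=%d" % (type(content).__name__, len(content)))
        if debug:
            # Opens an interactive prompt at the frame that raised.
            pdb.post_mortem(sys.exc_info()[2])
        raise


# Usage, assuming lxml is installed and `content` holds the fetched page:
# import lxml.html
# document = parse_with_debug(lxml.html.document_fromstring, content, debug=True)
```

Leaving debug at False keeps the wrapper non-interactive, which is handy when the script is run from an IDE console rather than a terminal.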

Fluxator commented 7 years ago

The url looks fine.

In [this function](https://github.com/coddingtonbear/python-myfitnesspal/blob/master/myfitnesspal/client.py#L614) after parser is set to html_parser the function returns None. html was the text of the login page for MFP. I tested it and it looks alright. kw is empty.

coddingtonbear commented 7 years ago

Could you try posting a link again? It doesn't look like there's a line 614 in client.py.

Fluxator commented 7 years ago

Oh sorry, I missed the point when I left client.py and ended up in __init__.py

client.py was left in [this line](https://github.com/coddingtonbear/python-myfitnesspal/blob/master/myfitnesspal/client.py#L204) with the correct url and fine-looking content.

coddingtonbear commented 7 years ago

Could you double-check that that is actually returning None on that line, and verify that content at that point does actually contain a bunch of webpage-y text? The reason I'm asking you to double-check is that your original traceback indicates that that call to lxml.html.document_fromstring doesn't return, and instead raises the ParserError: Document is empty exception.

Fluxator commented 7 years ago

It's this call value = etree.fromstring(html, parser, **kw) in __init__.py that I was referring to with the None return.

I checked content again by pasting it in a html file and it is the login page for MFP.

coddingtonbear commented 7 years ago

Wait; hrm; I don't think we use etree directly at all in python-myfitnesspal; could you send a link to where you're looking?

Fluxator commented 7 years ago

No, this call was in __init__.py which seems to be part of lxml.html.

The last call of python-myfitnesspal was in client.py in line 204.

I checked the call there and it seems to be fine. I don't know anything about the lxml.html stuff but I think it is odd that this call in the python prompt

import lxml.html
lxml.html.document_fromstring("this is not empty")

results in the same error (Document is empty)
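A slightly fuller version of that check can be sketched as below: it records the lxml and libxml2 versions alongside the parse result, so the Windows and Linux installs can be compared side by side. The function name lxml_self_check and the shape of the returned dict are assumptions for illustration; the version constants and ParserError are attributes of lxml.etree.

```python
def lxml_self_check(sample="<p>this is not empty</p>"):
    """Report lxml/libxml2 versions and whether a trivial document parses."""
    try:
        import lxml.html
        from lxml import etree
    except ImportError:
        # lxml is missing or its compiled extension failed to load.
        return {"installed": False}

    report = {
        "installed": True,
        "lxml": etree.LXML_VERSION,
        "libxml2_runtime": etree.LIBXML_VERSION,
        "libxml2_compiled": etree.LIBXML_COMPILED_VERSION,
    }
    try:
        root = lxml.html.document_fromstring(sample)
        report["parsed"] = True
        report["root_tag"] = root.tag
    except etree.ParserError as exc:
        # This is the exception type behind "Document is empty".
        report["parsed"] = False
        report["error"] = str(exc)
    return report


print(lxml_self_check())
```

If the sample parses on one machine and not on the other, the reported versions are the first thing to compare.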

coddingtonbear commented 7 years ago

Yeah; it's pretty clear at this point that the problem isn't in this library but in etree or lxml; I'm not aware enough of how those tools work on Windows to help you a whole lot more, but googling for etree and Document is Empty shows that at least a few other people have bumped into similar problems.

So, since the problem you're experiencing is not in this module, but instead in a dependency, I'm going to close this issue here. That said -- don't feel like I'm abandoning you! I'd be glad to help you out with this via the gitter channel to help you troubleshoot. I'm @coddingtonbear there just as I am here, so feel free to start a conversation with me over there!

Cheers!

coddingtonbear / python-myfitnesspal

Document is empty #44