coddingtonbear / python-myfitnesspal

Access your meal tracking data stored in MyFitnessPal programmatically
MIT License
794 stars 138 forks

IndexError in _get_goals #22

Closed ojcm closed 8 years ago

ojcm commented 9 years ago

Hello @coddingtonbear, I believe I have found an issue when running _get_goals; in fact, it looks like the same issue as #12. I am running v1.6 (downloaded today) on Python 2.7.10. See the extract below (username and password changed).

$ python
Python 2.7.10 (default, Jul 14 2015, 19:46:27) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import myfitnesspal
>>> client = myfitnesspal.Client('username','password')
>>> day = client.get_date(2015,1,1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/myfitnesspal-1.6-py2.7.egg/myfitnesspal/client.py", line 222, in get_date
    goals = self._get_goals(document)
  File "/Library/Python/2.7/site-packages/myfitnesspal-1.6-py2.7.egg/myfitnesspal/client.py", line 122, in _get_goals
    total_header = document.xpath("//tr[@class='total']")[0]
IndexError: list index out of range

I have checked client.py on my system and it includes the fix to #12.

I am happy to provide additional information if required.


coddingtonbear commented 9 years ago

The method this library uses for interacting with myfitnesspal is a little less than...stable. Basically any change to the underlying HTML will cause things to break in exactly the way you've found (and as was previously found in #12), because the library works as an HTML scraper, given how difficult it is to obtain an API key.

The bad news: it looks like this whole page was changed pretty substantially, so the existing methods for fetching goals will have to be updated.

The good news: it looks like they are using an (?undocumented) API endpoint for fetching goals -- https://api.myfitnesspal.com/v2/nutrient-goals -- so re-writing this to use that would definitely make things more predictable in the future.

I'm not sure I have time to look into this right now, but if you wanted to post a PR re-writing this functionality such that it uses the aforementioned API endpoint, I can promise a fairly quick review.

Cheers!
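
The failure mode described above (indexing the first result of a lookup that matched nothing) can be made easier to diagnose with a guarded lookup. The following is a minimal sketch, not the library's code: it uses the standard library's xml.etree.ElementTree on a well-formed snippet instead of the HTML parser seen in the traceback, and the names `GoalsRowNotFound` and `find_goals_row` are hypothetical.

```python
import xml.etree.ElementTree as ET


class GoalsRowNotFound(LookupError):
    """Raised when the page has no <tr class="total"> followed by a goals row."""


def find_goals_row(document):
    """Return the row that follows <tr class="total">, or raise a clear error.

    Hypothetical helper: ElementTree elements do not know their siblings,
    so we walk every element and look at its direct children in order.
    """
    for parent in document.iter():
        children = list(parent)
        for index, child in enumerate(children):
            if child.tag == "tr" and child.get("class") == "total":
                if index + 1 < len(children):
                    return children[index + 1]
                raise GoalsRowNotFound("total row has no following row")
    # Reaching this point is the situation that produced the bare IndexError.
    raise GoalsRowNotFound("no <tr class='total'> on the page; layout may have changed")
```

With a check like this, a changed page layout surfaces as a descriptive exception rather than "list index out of range".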

coddingtonbear commented 8 years ago

I'm under the assumption that this was fixed via https://github.com/coddingtonbear/python-myfitnesspal/pull/13; please let me know if that's not true!

Cheers, Adam

prio commented 8 years ago

Still an issue for me in 1.7.1 (downloaded today)

coddingtonbear commented 8 years ago

Thanks for the heads-up, @prio.

coddingtonbear commented 8 years ago

@prio -- I'm not able to replicate this issue; it looks like it's working just fine for me. Could you elaborate on what steps you're taking? Here's what I just did:

In [1]: from myfitnesspal import Client

In [2]: c = Client('<my username>')

In [3]: date = c.get_date(2015, 1, 25)

In [4]: date.goals
Out[4]:
{'calories': 2430,
 'carbohydrates': 304,
 'fat': 81,
 'protein': 122,
 'sodium': 2300,
 'sugar': 91}

coddingtonbear commented 8 years ago

For the moment I'm going to close this under the assumption that maybe you have two versions of myfitnesspal installed (and are in fact running the earlier version rather than the later). Please do reply if you can post an example of the steps you're going through to reproduce this error.

First, though, you may have better luck if you follow the standard steps for cleaning up your python environment:

  1. Run pip uninstall myfitnesspal repeatedly until it reports that there is nothing left to uninstall. This may take more than one run!
  2. Run pip install myfitnesspal
  3. Try running through the same steps I did in my earlier post for gathering the results for a given day.
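
Before or after the cleanup steps above, it can help to check which copy of a package Python actually imports. The following is a rough sketch, assuming Python 3.8 or newer (the thread itself uses Python 2.7, where importlib.metadata is not available); `describe_install` is a hypothetical helper name.

```python
import importlib.util
from importlib import metadata


def describe_install(module_name, dist_name=None):
    """Report where a module would be imported from and its installed version.

    Hypothetical helper. `dist_name` is the name used with pip, in case it
    differs from the importable module name.
    """
    spec = importlib.util.find_spec(module_name)
    location = spec.origin if spec is not None else None
    try:
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"location": location, "version": version}
```

Calling `describe_install("myfitnesspal")` and comparing the reported location and version against the output of pip freeze should show whether an older copy is shadowing the new one.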

Let me know if you need any help.

Cheers!

prio commented 8 years ago

Hi,

Only ever had one version installed (the latest).

$ pip freeze | grep -i myfitness
myfitnesspal==1.7.1
$ ipython
In [1]: import myfitnesspal
In [2]: client = myfitnesspal.Client('jon...org')
In [3]: client.get_date(2015, 1, 24)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-4-8b5dbcbf70d3> in <module>()
----> 1 client.get_date(2015, 1, 24)

/Users/jonathan/.pyenv/versions/anaconda-2.2.0/lib/python2.7/site-packages/myfitnesspal/client.pyc in get_date(self, *args, **kwargs)
    223
    224         meals = self._get_meals(document)
--> 225         goals = self._get_goals(document)
    226         notes = self._get_notes(document)
    227         water = self._get_water(document)

/Users/jonathan/.pyenv/versions/anaconda-2.2.0/lib/python2.7/site-packages/myfitnesspal/client.pyc in _get_goals(self, document)
    123
    124     def _get_goals(self, document):
--> 125         total_header = document.xpath("//tr[@class='total']")[0]
    126         goal_header = total_header.getnext()  # The following TR contains goals
    127         columns = goal_header.findall('td')

IndexError: list index out of range

How can I find out which URL it is failing on? I will then try the XPath manually.

coddingtonbear commented 8 years ago

In this case, it'd be your /food/diary/<yourusername>?date=2015-01-24 page, but you can get the exact URL with the following steps:

>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)

python-myfitnesspal uses the popular 'requests' library for its interactions with MyFitnessPal, and that library logs each request and the resulting status code. If you re-instantiate your myfitnesspal.Client in the same session, as you did above, you'll get a little more debugging information this time:

>>> import myfitnesspal
>>> client = myfitnesspal.Client('<yourusername>')
INFO:requests.packages.urllib3.connectionpool:Starting new HTTPS connection (1): www.myfitnesspal.com
DEBUG:requests.packages.urllib3.connectionpool:"GET /account/login HTTP/1.1" 200 11657
DEBUG:requests.packages.urllib3.connectionpool:"POST /account/login HTTP/1.1" 302 None
DEBUG:requests.packages.urllib3.connectionpool:"GET /en HTTP/1.1" 302 95
DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 302 None
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): www.myfitnesspal.com
DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 None
>>> client.get_date(2015, 1, 24)
DEBUG:requests.packages.urllib3.connectionpool:"GET /food/diary/<yourusername>?date=2015-01-24 HTTP/1.1" 200 None
<01/24/15 {}>
>>>
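
Outside an interactive session, the same request log can be collected in a list and inspected afterwards. This is only a sketch using the standard logging module: `ListHandler` and `capture_logger` are hypothetical names, and the logger name to pass in is whatever appears in your own output (in the transcript above it is requests.packages.urllib3.connectionpool, which may differ in other versions of requests).

```python
import logging


class ListHandler(logging.Handler):
    """Logging handler that stores formatted messages in a list."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.messages = []

    def emit(self, record):
        # getMessage() applies the %-style arguments to the log message.
        self.messages.append(record.getMessage())


def capture_logger(name):
    """Attach a ListHandler to the named logger and return the handler."""
    handler = ListHandler()
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    return handler
```

After calling `capture_logger(...)` and then running the client, `handler.messages` would hold lines such as the GET request for the diary page, which shows the exact path that was fetched.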

prio commented 8 years ago

OK, I think I have it. When setting my username I was using my login name (my email address) rather than the username MyFitnessPal assigned to me. It is working now. It may be worth stating this in the README. Thanks for your help.
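
The mix-up described above can be caught early with a small check before the diary page is requested. The sketch below is an illustration only, not part of the library: `diary_path` is a hypothetical helper, and the path format is the one quoted earlier in this thread.

```python
from datetime import date
from urllib.parse import quote


def diary_path(username, day):
    """Build the food diary path for a username and date.

    Hypothetical helper. An '@' suggests an email address was passed where
    the MyFitnessPal username is expected, so fail with a clear message.
    """
    if "@" in username:
        raise ValueError(
            "expected a MyFitnessPal username, got what looks like an email address"
        )
    return "/food/diary/{}?date={}".format(quote(username, safe=""), day.isoformat())
```

For example, `diary_path("someuser", date(2015, 1, 24))` gives the path for 24 January 2015, while passing an email address raises ValueError instead of silently fetching a page without the expected table.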

coddingtonbear commented 8 years ago

Ahh! Nice -- I had no idea that that was a thing at all. I've opened #27 to look into that; thanks for bearing with me on this -- I'm glad we got to the root of the problem!

coddingtonbear commented 8 years ago

OK @prio -- the above thing should never be a problem again after the most recent release (1.8.0).

Cheers!