coddingtonbear / python-myfitnesspal

Access your meal tracking data stored in MyFitnessPal programmatically
MIT License
789 stars 138 forks

Login now includes a hidden captcha that prevents python-myfitnesspal from logging-in #144

Closed andystev closed 1 year ago

andystev commented 1 year ago

Morning,

I wonder if something API-side has changed? Up to last night I was connecting fine, now this:

Traceback (most recent call last):
  File "/home/andy/MFP/importmyfitnesspal.py", line 19, in <module>
    client = myfitnesspal.Client('############', password='###################')
  File "/usr/local/lib/python3.10/dist-packages/myfitnesspal/client.py", line 80, in __init__
    self._login()
  File "/usr/local/lib/python3.10/dist-packages/myfitnesspal/client.py", line 132, in _login
    raise MyfitnesspalLoginError()

Any thoughts?

chazzaspazz commented 1 year ago

Also having this issue. Looks like maybe an update bricked it...

ColorfulQuark commented 1 year ago

I'm having the same problem. It does look like MFP changed something about the login process.

bverem commented 1 year ago

It looks like there's an invisible recaptcha.

coddingtonbear commented 1 year ago

OK... good and bad news!

The good news is that I've updated this library such that it's again able to access MyFitnessPal on your behalf. You can find those changes in version 2.0.0 of this library.

The bad news is that, as you might guess from the major version bump, the way we access login credentials had to be changed in ways that may limit how useful this library is to you. I'm really sorry about that, but I don't see an alternative at the moment.

Now that MyFitnessPal has added a hidden captcha to their login flow, this library can no longer log in directly with a username and password as it used to. Instead, it now uses the browser_cookie3 library to gather cookies from your local browser for use when interacting with MyFitnessPal.

To be a little more concrete:

If any of you have any clever ideas of other ways of getting around this limitation, I'd love to hear them!

Cheers & good luck!

kquinsland commented 1 year ago

There are some people that use this library indirectly via a Home Assistant integration. For those people, you may want to consider adding a way to ignore the cookiejar and just have the user supply the values directly.

For some context: https://github.com/helto4real/custom_component_myfitnesspal/issues/25

coddingtonbear commented 1 year ago

You can absolutely instantiate your own cookiejar and hand it directly to myfitnesspal.Client -- just import CookieJar from http.cookiejar and set the cookies you need. See the cookiejar parameter on the client here: https://python-myfitnesspal.readthedocs.io/en/latest/api/client.html.

Unfortunately, though, you're still going to need to log-in in an actual browser somewhere to find your session token given that the log-in flow has that invisible captcha. There is unfortunately no avenue forward via which we can return to accepting a username/password combination directly as we were before while a captcha is present.
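A minimal sketch of building such a cookiejar by hand with only the standard library. The helper names `make_cookie` and `build_cookiejar`, the cookie name `example_session_token`, its value, and the `.myfitnesspal.com` domain are all placeholders assumed for illustration; the cookie names MyFitnessPal actually expects are not stated in this thread and have to be read from your own logged-in browser session.

```python
from http.cookiejar import Cookie, CookieJar


def make_cookie(name: str, value: str, domain: str = ".myfitnesspal.com") -> Cookie:
    """Build a session cookie (hypothetical helper, not part of the library)."""
    return Cookie(
        version=0,
        name=name,
        value=value,
        port=None,
        port_specified=False,
        domain=domain,  # assumed domain, adjust if your browser shows another
        domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/",
        path_specified=True,
        secure=True,
        expires=None,  # no explicit expiry: treated as a session cookie
        discard=True,
        comment=None,
        comment_url=None,
        rest={},
    )


def build_cookiejar(values: dict[str, str]) -> CookieJar:
    """Create a CookieJar holding one cookie per name/value pair."""
    jar = CookieJar()
    for name, value in values.items():
        jar.set_cookie(make_cookie(name, value))
    return jar


# Placeholder name and value; copy the real ones from a browser session.
jar = build_cookiejar({"example_session_token": "PASTE-VALUE-HERE"})

# The jar could then be handed to the client via its `cookiejar` parameter:
# client = myfitnesspal.Client(cookiejar=jar)
```

The values still have to come from a real browser login, so this only changes where the cookies are read from, not how they are obtained.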

kquinsland commented 1 year ago

> There is unfortunately no avenue forward via which we can return to accepting a username/password combination directly as we were before while a captcha is present.

I did briefly try to decompile the latest Android APK to see which API endpoint mobile clients use to request auth, as there's no captcha there, but I ran out of time before I could really get my head around all the decompiled code. Probably for the better: it wouldn't be too difficult for MFP to detect/prevent unauthorized use of the mobile login endpoint :/.

Thanks for spelling out the cookie jar details... it shouldn't be too difficult to get the HA <-> MFP integration working again even if it means walking the user through browser debug/inspector tooling :/.
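One way the "copy it out of the browser inspector" step might be made less painful is to let the user paste the whole `Cookie` request header and split it into name/value pairs. This is only a sketch: the function name `parse_cookie_header` and the cookie names in the demo are made up, and it makes no attempt to decide which cookies MyFitnessPal actually needs.

```python
def parse_cookie_header(header: str) -> dict[str, str]:
    """Split a raw `Cookie:` request header into name/value pairs."""
    # Tolerate a pasted "Cookie: " prefix from the inspector.
    if header.lower().startswith("cookie:"):
        header = header.split(":", 1)[1]

    pairs: dict[str, str] = {}
    for part in header.split(";"):
        # Split on the first "=" only, since values may contain "=".
        name, sep, value = part.strip().partition("=")
        if sep and name:
            pairs[name] = value
    return pairs


# Demo with made-up cookie names and values.
pairs = parse_cookie_header("Cookie: example_a=abc123; example_b=x=y")
print(pairs)
```

The resulting pairs can then be set on whatever cookiejar is passed to the client.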

danleonard-nj commented 1 year ago

@coddingtonbear howdy and thanks for all your work on this library, know I've gotten a lot of use out of it.

I was pulling data on a scheduler in a pod in K8S so the browser option wasn't going to work for me, so I spun up something w/ a headless webdriver that works fairly well.

Obviously comes w/ all the standard pitfalls of any web automation like this, but it gets past the reCAPTCHA and logs in consistently. Not sure if something like this would really belong in the library considering the dependency on the webdriver, but figured I would share in case it's useful to others.

It supports saving/loading credentials so ultimately should only have to actually login once in a while (think 30 days is the auth token expiration or something like that).

Here's an example of how I'm using it:

from webdriver_auth import SeleniumAuth
import myfitnesspal

auth = SeleniumAuth(
    username='username',
    password='password',
    webdriver_path=r'C:\chromedriver.exe',
    creds_filepath='creds.json',
    use_stored_credentials=True)

def get_mfp_client():
    if auth.login(max_wait_time=30):
        return myfitnesspal.Client(
            cookiejar=auth.cookiejar)

mfp_client = get_mfp_client()
meals = mfp_client.get_meals()

Auth client:

import json
import logging
import os
from typing import Tuple

import requests
from requests.cookies import RequestsCookieJar
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.remote.webelement import WebElement
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from seleniumwire import webdriver

logger = logging.getLogger()

def first(items, func):
    for item in items:
        if func(item):
            return item
    return None

def any(items, func):
    for item in items:
        if func(item):
            return True
    return False

def foreach(items, func):
    for item in items:
        func(item)

def empty(items):
    # NOTE: despite the name, this returns True when `items` is non-empty
    if items is not None and len(items) > 0:
        return True
    return False

# Quiet seleniumwire's request logging. `setLevel` returns None, so the
# result is not assigned.
logging.getLogger(
    'seleniumwire.handler').setLevel(level=logging.WARNING)

class SeleniumAuthConstants:
    LOGIN_URL = 'https://www.myfitnesspal.com/account/login'
    LOGGER_NAME = 'seleniumauth'
    TERMINAL_REQUEST = 'featured_blog_posts'

class Element:
    EMAIL_NAME = 'email'
    PASSWORD_NAME = 'password'
    SUBMIT_XPATH = "//button[@type='submit']"
    TOOLBAR_XPATH = "//span[contains(@class,'MuiContainer-maxWidthLg')]"

class SeleniumAuth:
    @property
    def cookies(
        self
    ) -> list[dict]:
        '''
        The raw cookie data returned from the
        webdriver requests or stored credentials
        '''

        return self.__cookies

    @property
    def cookiejar(
        self
    ) -> RequestsCookieJar:
        '''
        A `RequestsCookieJar` with the cookies captured
        from the webdriver or from loaded credentials.
        '''

        return self.__cookiejar

    def __init__(
        self,
        username: str,
        password: str,
        webdriver_path: str,
        creds_filepath: str = None,
        use_stored_credentials: bool = True
    ):
        self.__username = username
        self.__password = password
        self.__webdriver_path = webdriver_path
        self.__creds_filepath = creds_filepath
        self.__use_stored_credentials = use_stored_credentials

        self.__driver: webdriver.Chrome = None

        self.__cookies = list()
        self.__cookiejar = None

    def __initialize_driver(self) -> None:
        ''' Create and configure the Chrome webdriver '''

        options = webdriver.ChromeOptions()
        options.headless = True

        driver = webdriver.Chrome(
            self.__webdriver_path,
            options=options)

        driver.set_window_size(1920, 1080)
        driver.maximize_window()

        self.__driver = driver

    def __navigate(
        self
    ) -> None:
        '''
        Navigate to service login page
        '''

        self.__driver.get(
            url=SeleniumAuthConstants.LOGIN_URL)

    def __get_login_input_elements(
        self
    ) -> Tuple[WebElement, WebElement]:
        ''' Get the login username and password elements '''

        logger.info(f'Waiting for email element')
        email = self.__get_element(
            _type='name',
            selector=Element.EMAIL_NAME)

        logger.info(f'Waiting for password element')
        password = self.__get_element(
            _type='name',
            selector=Element.PASSWORD_NAME)

        return email, password

    def __create_cookiejar(
        self,
        cookies=None
    ) -> None:
        '''
        Create a cookiejar from the captured driver cookies (or
        from previously stored cookies) and store both the jar
        and the raw cookie list on the instance
        '''

        raw_cookies = cookies or self.__driver.get_cookies()

        if self.__driver:
            self.__driver.quit()

        if not empty(raw_cookies or []):
            raise Exception('Failed to capture driver cookies')

        # Convert the dictionary of cookie values returned from
        # the driver or loaded from stored credentials into a
        # requests cookiejar (derived from http.CookieJar used
        # by MFP client)
        logger.info(f'Creating requests session cookiejar')
        session = requests.session()

        foreach(raw_cookies, lambda cookie: session.cookies.set(
            name=cookie.get('name'),
            value=cookie.get('value'))
        )

        self.__cookiejar = session.cookies
        self.__cookies = raw_cookies

        if (self.__use_stored_credentials
                and self.__creds_filepath is not None):

            self.__save_credentials()

    def __get_element(
        self,
        _type: str,
        selector: str,
        max_wait_time=10
    ) -> WebElement:
        '''
        Explicitly wait for an element to become available
        on the page and return the element

        `_type`: locator strategy (e.g. 'name' or 'xpath')
        `selector`: locator value for that strategy
        `max_wait_time`: maximum time to wait before timeout
        '''

        locator = (_type, selector)

        element = WebDriverWait(
            driver=self.__driver,
            timeout=max_wait_time).until(
                method=EC.presence_of_element_located((
                    locator)))

        return element

    def __get_submit(
        self
    ) -> WebElement:
        ''' Get the login page submit button element '''

        logger.info(f'Locating submit element')
        submit = self.__get_element(
            _type='xpath',
            selector=Element.SUBMIT_XPATH)

        logger.info(f'Submit element: {submit}')

        if isinstance(submit, list):
            return submit[0]

        return submit

    def __get_distinct_request_urls(
        self
    ) -> list[str]:
        '''
        Get a list of all distinct request URLs from
        the webdriver
        '''

        urls = set([
            req.url for req in self.__driver.requests
        ])

        return list(urls)

    def __wait_for_login(
        self,
        max_wait_time: int
    ) -> None:
        '''
        Wait for the login to complete indicated by
        the toolbar element becoming available on the
        page before exiting.  Otherwise seleniumwire
        may not capture the full network trace for the
        login and the required cookies

        `max_wait_time` : maximum time to wait on the
        toolbar element, this may
        '''

        # Use the XHR request that fetches the splash page's blog
        # post list as an indicator that we've captured all the
        # auth-related requests.  Otherwise the driver may exit
        # before this happens

        def terminal_request_received(driver: webdriver.Chrome):
            distinct_urls = self.__get_distinct_request_urls()

            if any(items=distinct_urls,
                   func=lambda url: SeleniumAuthConstants.TERMINAL_REQUEST in url):
                logger.info(f'Terminal request received')
                return True

            return False

        WebDriverWait(
            driver=self.__driver,
            timeout=max_wait_time).until(
                method=terminal_request_received)

        logger.info(f'Terminal login request received')

    def __login(
        self,
        max_wait_time: int
    ) -> RequestsCookieJar:
        '''
        Internal login routine, to be wrapped in try/except
        to capture and return specific exceptions concerning
        timeouts
        '''

        # Attempt to load stored credentials and use those if
        # available
        if (self.__use_stored_credentials
                and self.__creds_filepath is not None
                and os.path.exists(self.__creds_filepath)):

            logger.info('Using stored credentials')
            return self.load_credentials()

        logger.info('No stored credentials found, using webdriver')

        self.__initialize_driver()
        self.__navigate()

        usr, pwd = self.__get_login_input_elements()

        usr.send_keys(self.__username)
        pwd.send_keys(self.__password)

        submit = self.__get_submit()
        submit.click()

        self.__wait_for_login(
            max_wait_time=max_wait_time)

        self.__create_cookiejar()

    def __save_credentials(
            self
    ) -> None:
        '''
        Export and save a JSON file containing the raw
        cookie data
        '''

        with open(self.__creds_filepath, 'w') as file:
            file.write(json.dumps(self.__cookies))

    def load_credentials(
        self
    ) -> bool:
        '''
        Load credentials from the provided filepath if
        the file exists
        '''

        if not os.path.exists(self.__creds_filepath):
            raise Exception(
                f"Could not find saved credentials at path: '{self.__creds_filepath}'")

        with open(self.__creds_filepath, 'r') as file:
            cookies = json.loads(file.read())

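            # __create_cookiejar sets self.__cookiejar and
            # self.__cookies itself and returns None, so its
            # result must not be assigned to self.__cookiejar
            self.__create_cookiejar(
                cookies=cookies)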

        return True

    def login(
        self,
        max_wait_time: int,
    ) -> bool:
        '''
        If `use_stored_credentials` is enabled:
        Attempt to fetch credentials from credential filepath and
        if the file does not exist then login using the webdriver
        and store the captured cookies

        If `use_stored_credentials` is not enabled:
        Login using a headless webdriver then capture driver cookies
        in a `requests` session cookiejar

        `max_wait_time` : max time to wait for the driver to
        return a response
        '''

        logger.info('Starting webdriver login')

        try:
            self.__login(
                max_wait_time=max_wait_time)

            return True

        except TimeoutException as ex:
            self.__driver.save_screenshot('timeout.png')
            logger.exception(
                msg='Webdriver timeout exception')

            # Chain the original exception so the traceback is preserved
            raise Exception(
                'Login timed out, consider extending maximum wait times') from ex
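
The class above reuses the stored cookie file whenever it exists, regardless of its age. If the session really does expire after roughly 30 days (a guess from the comment above, not a confirmed figure), a freshness check on the file might look something like this. The function name `load_fresh_cookies` and the `MAX_AGE_DAYS` constant are hypothetical and not part of the class above.

```python
import json
import os
import tempfile
import time
from typing import Optional

MAX_AGE_DAYS = 30  # assumption: based on the rough estimate in this thread


def load_fresh_cookies(path: str, max_age_days: float = MAX_AGE_DAYS) -> Optional[list]:
    """Return the stored cookie list, or None if the file is missing or too old."""
    if not os.path.exists(path):
        return None

    # Use the file's modification time as a stand-in for the login time.
    age_seconds = time.time() - os.path.getmtime(path)
    if age_seconds > max_age_days * 86400:
        return None

    with open(path, 'r') as file:
        return json.load(file)


# Demo against a throwaway file with a made-up cookie.
with tempfile.TemporaryDirectory() as tmp:
    creds_path = os.path.join(tmp, 'creds.json')
    with open(creds_path, 'w') as file:
        json.dump([{'name': 'example', 'value': 'abc'}], file)
    fresh = load_fresh_cookies(creds_path)

    # Backdate the file by 31 days to simulate an expired session.
    old = time.time() - 31 * 86400
    os.utime(creds_path, (old, old))
    stale = load_fresh_cookies(creds_path)
```

A `None` result would be the signal to fall back to the webdriver login and overwrite the file.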

goliath888 commented 1 year ago

Hi @danleonard-nj, I am trying your solution, but I am having problems with the cookie notice "Your Choices Regarding Cookies on this Site": I need to click the agree button, but I can't manage to. How did you get past that popup? It is the one with class="truste_overlay".

Error while running the script: Message: element click intercepted: Element <button class="MuiButton-root MuiButton-contained MuiButton-containedPrimary MuiButton-sizeMedium MuiButton-containedSizeMedium MuiButton-fullWidth MuiButtonBase-root css-1hgit5d" tabindex="0" type="submit">...</button> is not clickable at point (960, 461). Other element would receive the click: <div id="pop-div06075297644599102" class="truste_overlay"></div>

danleonard-nj commented 1 year ago

@goliath888 I was able to get past that by maximizing the webdriver and setting a larger window size.

goliath888 commented 1 year ago

> @goliath888 I was able to get past that by maximizing the webdriver and setting a larger window size.

Tried with driver.set_window_size(2560, 1440) and driver.set_window_size(3840, 2160), but the problem remains.

I think that the div only goes away if you click on the agree button.

schambersnh commented 1 year ago

@goliath888 any luck here? I'm running into the same "is not clickable" exception.

@danleonard-nj what was the exact window size you selected?

schambersnh commented 1 year ago

I solved this. MFP added a cookie iframe at the bottom. Simply had to modify the driver to switch to it, click it, and proceed with the rest of the code.

Adding it here in case it's helpful.

import json
import logging
import os
import time
from typing import Tuple

import requests
from requests.cookies import RequestsCookieJar
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.remote.webelement import WebElement
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from seleniumwire import webdriver

logger = logging.getLogger()

def first(items, func):
    for item in items:
        if func(item):
            return item
    return None

def any(items, func):
    for item in items:
        if func(item):
            return True
    return False

def foreach(items, func):
    for item in items:
        func(item)

def empty(items):
    # NOTE: despite the name, this returns True when `items` is non-empty
    if items is not None and len(items) > 0:
        return True
    return False

# Quiet seleniumwire's request logging. `setLevel` returns None, so the
# result is not assigned.
logging.getLogger(
    'seleniumwire.handler').setLevel(level=logging.WARNING)

class SeleniumAuthConstants:
    LOGIN_URL = 'https://www.myfitnesspal.com/account/login'
    LOGGER_NAME = 'seleniumauth'
    TERMINAL_REQUEST = 'featured_blog_posts'

class Element:
    EMAIL_NAME = 'email'
    PASSWORD_NAME = 'password'
    ACCEPT_COOKIES_XPATH = "//button[@title='ACCEPT']"
    COOKIE_IFRAME_XPATH = "//iframe[@title='SP Consent Message']"
    SUBMIT_XPATH = "//button[@type='submit']"
    TOOLBAR_XPATH = "//span[contains(@class,'MuiContainer-maxWidthLg')]"

class SeleniumAuth:
    @property
    def cookies(self) -> list[dict]:
        '''
        The raw cookie data returned from the
        webdriver requests or stored credentials
        '''

        return self.__cookies

    @property
    def cookiejar(self) -> RequestsCookieJar:
        '''
        A `RequestsCookieJar` with the cookies captured
        from the webdriver or from loaded credentials.
        '''

        return self.__cookiejar

    def __init__(
        self,
        username: str,
        password: str,
        webdriver_path: str,
        creds_filepath: str = None,
        use_stored_credentials: bool = True
    ):
        self.__username = username
        self.__password = password
        self.__webdriver_path = webdriver_path
        self.__creds_filepath = creds_filepath
        self.__use_stored_credentials = use_stored_credentials

        self.__driver: webdriver.Chrome = None

        self.__cookies = list()
        self.__cookiejar = None

    def __initialize_driver(self) -> None:
        ''' Create and configure the Chrome webdriver '''

        options = webdriver.ChromeOptions()
        #options.headless = True
        options.add_argument("--start-maximized")
        options.add_argument("--kiosk")

        driver = webdriver.Chrome(
            self.__webdriver_path,
            options=options)

        driver.set_window_size(1920, 1080)
        driver.maximize_window()

        driver.set_page_load_timeout(60)

        print('driver initialized')
        self.__driver = driver

    def __navigate(self) -> None:
        '''
        Navigate to service login page
        '''

        self.__driver.get(
            url=SeleniumAuthConstants.LOGIN_URL)

    def __get_login_input_elements(self) -> Tuple[WebElement, WebElement]:
        ''' Get the login username and password elements '''

        print('Waiting for email element')
        email = self.__get_element(
            _type='name',
            selector=Element.EMAIL_NAME)

        print(f'Waiting for password element')
        password = self.__get_element(
            _type='name',
            selector=Element.PASSWORD_NAME)

        return email, password

    def __create_cookiejar(self, cookies=None) -> None:
        '''
        Create a cookiejar from the captured driver cookies (or
        from previously stored cookies) and store both the jar
        and the raw cookie list on the instance
        '''

        raw_cookies = cookies or self.__driver.get_cookies()

        if self.__driver:
            self.__driver.quit()

        if not empty(raw_cookies or []):
            raise Exception('Failed to capture driver cookies')

        # Convert the dictionary of cookie values returned from
        # the driver or loaded from stored credentials into a
        # requests cookiejar (derived from http.CookieJar used
        # by MFP client)
        print(f'Creating requests session cookiejar')
        session = requests.session()

        foreach(raw_cookies, lambda cookie: session.cookies.set(
            name=cookie.get('name'),
            value=cookie.get('value'))
        )

        self.__cookiejar = session.cookies
        self.__cookies = raw_cookies

        if (self.__use_stored_credentials
                and self.__creds_filepath is not None):

            self.__save_credentials()

    def __get_element(
        self,
        _type: str,
        selector: str,
        max_wait_time=30
    ) -> WebElement:
        '''
        Explicitly wait for an element to become available
        on the page and return the element

        `_type`: locator strategy (e.g. 'name' or 'xpath')
        `selector`: locator value for that strategy
        `max_wait_time`: maximum time to wait before timeout
        '''
        print(f'getting element')

        locator = (_type, selector)

        element = WebDriverWait(
            driver=self.__driver,
            timeout=max_wait_time).until(
                method=EC.presence_of_element_located((
                    locator)))

        return element

    def __get_submit(self) -> WebElement:
        ''' Get the login page submit button element '''

        print(f'Locating submit element')
        submit = self.__get_element(
            _type='xpath',
            selector=Element.SUBMIT_XPATH)

        print(f'Submit element: {submit}')

        if isinstance(submit, list):
            return submit[0]

        return submit

    def __get_accept(self) -> WebElement:
        ''' Get the cookie consent accept button element '''

        print(f'Locating accept element')
        accept = self.__get_element(
            _type='xpath',
            selector=Element.ACCEPT_COOKIES_XPATH)

        print(f'Accept element: {accept}')

        if isinstance(accept, list):
            return accept[0]

        return accept

    def __get_cookie_iframe(self) -> WebElement:
        ''' Get the cookie consent iframe element '''

        print(f'Locating cookie iframe element')
        iframe = self.__get_element(
            _type='xpath',
            selector=Element.COOKIE_IFRAME_XPATH)

        print(f'Iframe element: {iframe}')

        if isinstance(iframe, list):
            return iframe[0]

        return iframe

    def __get_distinct_request_urls(self) -> list[str]:
        '''
        Get a list of all distinct request URLs from
        the webdriver
        '''

        urls = set([
            req.url for req in self.__driver.requests
        ])

        return list(urls)

    def __wait_for_login(self, max_wait_time: int) -> None:
        '''
        Wait for the login to complete indicated by
        the toolbar element becoming available on the
        page before exiting.  Otherwise seleniumwire
        may not capture the full network trace for the
        login and the required cookies

        `max_wait_time` : maximum time to wait on the
        toolbar element, this may
        '''

        # Use the XHR request that fetches the splash page's blog
        # post list as an indicator that we've captured all the
        # auth-related requests.  Otherwise the driver may exit
        # before this happens

        def terminal_request_received(driver: webdriver.Chrome):
            distinct_urls = self.__get_distinct_request_urls()

            if any(items=distinct_urls,
                   func=lambda url: SeleniumAuthConstants.TERMINAL_REQUEST in url):
                print(f'Terminal request received')
                return True

            return False

        WebDriverWait(
            driver=self.__driver,
            timeout=max_wait_time).until(
                method=terminal_request_received)

        print(f'Terminal login request received')

    def __login(self, max_wait_time: int):
        '''
        Internal login routine, to be wrapped in try/except
        to capture and return specific exceptions concerning
        timeouts
        '''

        # Attempt to load stored credentials and use those if
        # available
        if (self.__use_stored_credentials
                and self.__creds_filepath is not None
                and os.path.exists(self.__creds_filepath)):

            print('Using stored credentials')
            return self.load_credentials()

        print('No stored credentials found, using webdriver')

        self.__initialize_driver()
        self.__navigate()

        cookie_iframe = self.__get_cookie_iframe()

        # switch to the iframe
        self.__driver.switch_to.frame(cookie_iframe)

        print('switched to iframe')

        # click the accept button
        accept = self.__get_accept()
        accept.click()

        # switch back
        self.__driver.switch_to.default_content()
        print('switched to default content')

        # login to MyFitnessPal
        usr, pwd = self.__get_login_input_elements()

        usr.send_keys(self.__username)
        pwd.send_keys(self.__password)

        submit = self.__get_submit()
        submit.click()

        self.__wait_for_login(
            max_wait_time=max_wait_time)

        self.__create_cookiejar()

    def __save_credentials(self) -> None:
        '''
        Export and save a JSON file containing the raw
        cookie data
        '''

        with open(self.__creds_filepath, 'w') as file:
            file.write(json.dumps(self.__cookies))

    def load_credentials(self) -> bool:
        '''
        Load credentials from the provided filepath if
        the file exists
        '''

        if not os.path.exists(self.__creds_filepath):
            raise Exception(
                f"Could not find saved credentials at path: '{self.__creds_filepath}'")

        with open(self.__creds_filepath, 'r') as file:
            cookies = json.loads(file.read())

            # __create_cookiejar sets self.__cookiejar and
            # self.__cookies itself and returns None, so its
            # result must not be assigned to self.__cookiejar
            self.__create_cookiejar(
                cookies=cookies)

        return True

    def login(self, max_wait_time: int) -> bool:
        '''
        If `use_stored_credentials` is enabled:
        Attempt to fetch credentials from credential filepath and
        if the file does not exist then login using the webdriver
        and store the captured cookies

        If `use_stored_credentials` is not enabled:
        Login using a headless webdriver then capture driver cookies
        in a `requests` session cookiejar

        `max_wait_time` : max time to wait for the driver to
        return a response
        '''

        print('starting webdriver login')

        try:
            self.__login(
                max_wait_time=max_wait_time)

            return True

        except TimeoutException as ex:
            self.__driver.save_screenshot('timeout.png')
            logger.exception(
                msg='Webdriver timeout exception')

            # Chain the original exception so the traceback is preserved
            raise Exception(
                'Login timed out, consider extending maximum wait times') from ex

bhaktatejas922 commented 6 months ago

@schambersnh do you have a more up-to-date version of this that works? I couldn't get it to work; it fails at multiple points.