ethanopp / fitly

Self hosted web analytics for endurance athletes
MIT License

Where do we use df_samples? #9

Open · pierretamisier opened 4 years ago

pierretamisier commented 4 years ago

I want to import all my Strava data into the tool (1500+ activities since 2007), so I'm trying to import a CSV extract from http://flink.run

In the datapull.refresh_database function, I replace the Strava API call:

activities = client.get_activities(after=after, limit=0)

with this snippet:

if strava_connected():
    athlete_id = 1  # TODO: Make this dynamic if ever expanding to more users
    client = get_strava_client()
    after = config.get('strava', 'activities_after_date')

    activities = []

    # Needs at module level: import csv, stravalib; from datetime import timedelta
    with open('strava.csv', newline='') as strava_csv:
        reader = csv.DictReader(strava_csv)
        for row in reader:
            act = stravalib.model.Activity(
                name=row['name'],
                distance=float(row['distance']) if row['distance'] else 0,
                # timedelta's first positional arg is days, so pass seconds explicitly
                moving_time=timedelta(seconds=int(row['moving_time'])) if row['moving_time'] else timedelta(0),
                elapsed_time=timedelta(seconds=int(row['elapsed_time'])) if row['elapsed_time'] else timedelta(0),
                total_elevation_gain=float(row['total_elevation_gain']) if row['total_elevation_gain'] else 0,
                type=row['type'],
                workout_type=row['workout_type'],
                id=row['id'],
                # ...etc...
            )
            activities.append(act)

Now the problem is downstream: df_samples makes an API call to get the streams for each activity. I'm still not clear on what these df_samples are used for. Can someone post a screenshot of where they are used in the UI?

I'm trying to think of ways to import archive data without blowing through the API rate limit (100 calls per 15 minutes). What am I losing if I don't have any df_samples in the tool? What are they used for in terms of KPIs?

I believe the strength of any charting tool lies in its ability to build models on past data.

ethanopp commented 4 years ago

The samples are what's used for everything, really... The entire PMC (calculating stress scores) and the performance dashboard are based on that data.
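To make it concrete, here's roughly the kind of math the samples feed. This is just a sketch of a generic Coggan-style TSS calculation, not fitly's exact code; power_samples (a 1 Hz list of watts, assumed to have at least 30 entries) and ftp (functional threshold power) are hypothetical inputs:

# Sketch only: generic Coggan-style TSS from a 1 Hz power stream.
# power_samples and ftp are hypothetical inputs, not fitly's schema.
def normalized_power(power_samples):
    window = 30  # 30-second rolling average
    rolling = [
        sum(power_samples[i:i + window]) / window
        for i in range(len(power_samples) - window + 1)
    ]
    # 4th-power mean, then 4th root (normalized power)
    return (sum(p ** 4 for p in rolling) / len(rolling)) ** 0.25

def tss(power_samples, ftp):
    np_ = normalized_power(power_samples)
    intensity = np_ / ftp  # intensity factor (IF)
    duration_s = len(power_samples)  # one sample per second
    return (duration_s * np_ * intensity) / (ftp * 3600) * 100

The 30-second rolling / 4th-power step is exactly what summary fields like average power can't reproduce, which is why the per-sample streams matter.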

It may be better for us to just work some logic into the code that is mindful of the Strava API limitations.

Could you pull the latest commit and run a truncate all with your debug log enabled?

I would like to see where you are getting errors from the Strava side, and then I can work in some rest timers to avoid hitting the API limits, should they be needed. I'm using the stravalib library, which should have some rate-limit detection built in: https://pythonhosted.org/stravalib/_modules/stravalib/exc.html#RateLimitExceeded
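Something along these lines could wrap the stream fetch (just a sketch; fetch_streams is a hypothetical helper around the existing get_activity_streams call, and whether RateLimitExceeded carries a suggested timeout depends on the stravalib version, hence the fallback):

import time
from stravalib.exc import RateLimitExceeded

# Sketch: retry a stravalib call after sleeping out the 15-minute quota.
def fetch_streams(client, activity_id, types, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.get_activity_streams(activity_id, types=types)
        except RateLimitExceeded as e:
            # Some stravalib versions attach the suggested wait time;
            # fall back to a full 15-minute window if it's not there.
            wait = getattr(e, 'timeout', None) or 15 * 60
            time.sleep(wait)
    raise RuntimeError('gave up on activity %s after %s rate-limit waits' % (activity_id, max_retries))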

I've also just worked in some caching for Peloton and Stryd so as not to overload their APIs, so maybe we could do something similar for Strava as well.
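For Strava it might be as simple as caching each activity's streams on disk, so a truncate/reload doesn't re-burn API quota. Sketch only; the cache directory and pickle format are assumptions, not how the Peloton/Stryd caching actually works:

import os
import pickle

CACHE_DIR = 'strava_stream_cache'  # assumed location, adjust to taste

def get_streams_cached(client, activity_id, types):
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, '%s.pkl' % activity_id)
    if os.path.exists(path):
        with open(path, 'rb') as f:
            return pickle.load(f)  # cache hit: no API call
    streams = client.get_activity_streams(activity_id, types=types)
    with open(path, 'wb') as f:
        pickle.dump(streams, f)
    return streams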

Here's an example of one workout on the performance dashboard: [screenshot]

And of course the performance dashboard itself: [screenshot]

pierretamisier commented 4 years ago

"It may be better for us to just work some logic into the code that is mindful of the Strava API limitations."

I fully agree with this. I logged all the API calls yesterday while debugging: I hit 99 calls, most of them from getting the streams (streams = get_strava_client().get_activity_streams(self.id, types=types)). Unfortunately I can't remember seeing anything special in the debug log or in the console (sorry, I can't reproduce the issue at the moment since I can't build the app since the latest commits; I raised another issue for that). No particular exception was raised or displayed. I only realized I had hit the API rate limit after logging into the Strava portal.

I'll think of ways we can make the API calls respect the limits while keeping track of the import's status.
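Something as simple as a checkpoint file might do it (just a sketch; the file name and fields are made up, not part of fitly):

import json
import os

STATE_FILE = 'import_state.json'  # hypothetical checkpoint file

def load_state():
    # Resume point for the importer; None means start from scratch.
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {'last_activity_id': None}

def save_state(last_activity_id):
    with open(STATE_FILE, 'w') as f:
        json.dump({'last_activity_id': last_activity_id}, f)

That way the importer can stop cleanly when the quota is hit and pick up where it left off on the next run.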