appsembler / figures

Reporting and data retrieval app for Open edX
MIT License

Figures Questions (Candidates for updating docs or a FAQ) #237

Open johnbaldwin opened 4 years ago

johnbaldwin commented 4 years ago

In the Open edX Slack #figures channel (link: https://openedx.slack.com/archives/CD0H6H8P5/p1594795591093400), I was asked:

  1. Can Figures work without Appsembler’s edx-platform fork? I mean with the Open edX version of edx-platform?

  2. Is Figures scalable? If yes, then how much? Do we have any metrics from a stress test?

  3. Do we need an intermediary cloud data warehouse in the future to keep it working for a huge pool of users?

  1. Can Figures work without Appsembler’s edx-platform fork? I mean with the Open edX version of edx-platform?

Figures should work on the upstream/community version of Open edX Hawthorn (as of July 2020). If Figures does not work with the upstream/community Hawthorn release of Open edX, please open a ticket in GitHub issues (https://github.com/appsembler/figures/issues) and ping John in the Open edX Slack #figures channel.

As for supported Open edX releases: we know we need to upgrade Figures to work on Juniper, but I first have other work to address for our customers, primarily improving the API, the metrics served, and response performance.

It is important to note that multisite support for Figures has been coded to use Appsembler's fork.

  2. Is Figures scalable? If yes, then how much? Do we have any metrics from a stress test?

Figures' scalability depends directly on the scalability of the LMS infrastructure running it. Figures is designed to use the existing LMS infrastructure: MySQL, Celery for asynchronous data processing jobs (such as the daily metrics extraction and aggregation), and the capacity of the server hosting the app server.

With that, one of my main focuses right now is improving Figures' performance: making API queries (Django QuerySet queries) more efficient, aggregating data to reduce the need for live queries on built-in LMS models (like courseware.models.StudentModule), and scaling out the pipeline Celery tasks while ensuring resiliency to handle failed Celery tasks.
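To illustrate the aggregation idea mentioned above, here is a minimal sketch using only Python's standard library. It is not Figures' actual code or schema: the table names (`student_activity`, `daily_metrics`) and function names are hypothetical, and SQLite stands in for the LMS MySQL database. The point is only that a periodic job can summarize raw per-learner rows once, so that API requests read a small summary row instead of scanning the raw table.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with hypothetical raw and summary tables."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for a large per-learner table such as StudentModule.
    conn.execute("CREATE TABLE student_activity (student_id INTEGER, day TEXT)")
    # Hypothetical summary table: one row per day.
    conn.execute(
        "CREATE TABLE daily_metrics ("
        "day TEXT PRIMARY KEY, active_learners INTEGER, events INTEGER)"
    )
    conn.executemany(
        "INSERT INTO student_activity VALUES (?, ?)",
        [
            (1, "2020-07-15"),
            (1, "2020-07-15"),
            (2, "2020-07-15"),
            (3, "2020-07-16"),
        ],
    )
    return conn


def build_daily_summary(conn, day):
    """Aggregate one day's raw rows into a single summary row.

    Safe to re-run for the same day: the existing row is replaced.
    """
    active_learners, events = conn.execute(
        "SELECT COUNT(DISTINCT student_id), COUNT(*) "
        "FROM student_activity WHERE day = ?",
        (day,),
    ).fetchone()
    conn.execute(
        "INSERT OR REPLACE INTO daily_metrics (day, active_learners, events) "
        "VALUES (?, ?, ?)",
        (day, active_learners, events),
    )
    conn.commit()


def get_daily_metrics(conn, day):
    """Read the precomputed summary; returns None if the day was not aggregated."""
    return conn.execute(
        "SELECT active_learners, events FROM daily_metrics WHERE day = ?",
        (day,),
    ).fetchone()
```

In a real deployment the aggregation step would presumably run as a scheduled asynchronous job rather than inline, and because it can be re-run for the same day without duplicating rows, a failed run can simply be retried.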

"Stress test metrics": nothing formal. As Appsembler's Tahoe data grows, we find API performance issues and address them, so our stress testing is effectively running Figures in production. I don't have any formal metrics I can release at this time.

I am working piece by piece on building a development environment that can do stress testing with synthetic data. I just released an early version of Celery on Docker for the Figures development environment, "devsite":

https://github.com/appsembler/figures/blob/master/devsite/README.md

In the backlog is implementing a MySQL Docker container option, then incrementally improving synthetic data generation.

  3. Do we need an intermediary cloud data warehouse in the future to keep it working for a huge pool of users?

This is an open question. Appsembler is committed to enabling Figures to work on small deployments (which it does now) for the members of the community who need to deploy standalone servers. As I mentioned above, Figures uses available LMS resources. This helps Figures scale as its underlying infrastructure scales. We are also committed to our customers, which means that Figures also has to scale to meet our customers' needs in a multi-tenant platform.

iam-mhaseeb commented 4 years ago

Thanks @johnbaldwin for answering questions.

asadmanzoor93 commented 4 years ago

Thanks @johnbaldwin for detailed answers.