Closed — letrain02 closed this issue 2 years ago
Nice. I'm just about to merge an update that adds some stat data, though not specifically what you mentioned here. I'll come back to these in the future.
Additionally, just an overall statistic of "Here's how much disk space Unmanic has saved you so far" would be great. This is a fantastic project. Thanks so much for your work so far.
@Apocrathia Depending on the user's history, that could be a lot of data to parse for the calculation. I'm thinking of people who have converted thousands of files... that would be thousands of stats to parse and sum every time the page is loaded. It may be fine at the start, but eventually it would become a slow page load. Perhaps we could have a separate "reports" page, or the ability to export a report that contains totals like that?
How are the statistics currently being stored? I've noticed that there is no SQL backend as I'm trying to dig through the code and learn how the program is set up. This could probably be achieved with a simple SQLite query that only executes once a job has finished, with the value stored in a statistics table. That way, it isn't being recalculated every single time the user opens the history page. It would get ridiculously slow if the application tried to calculate that value on every load.
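The once-per-job approach could be sketched roughly like this (a hypothetical example, not Unmanic's actual schema; the `statistics` table and `bytes_saved` key are assumptions):

```python
import sqlite3

# Keep a running "disk space saved" total in a statistics table,
# updated once per completed job rather than recalculated per page load.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statistics (key TEXT PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO statistics VALUES ('bytes_saved', 0)")

def record_job(conn, source_size, dest_size):
    """Add one finished job's savings to the running total."""
    conn.execute(
        "UPDATE statistics SET value = value + ? WHERE key = 'bytes_saved'",
        (source_size - dest_size,),
    )
    conn.commit()

record_job(conn, 4_000_000, 1_500_000)
record_job(conn, 2_000_000, 1_000_000)
total = conn.execute(
    "SELECT value FROM statistics WHERE key = 'bytes_saved'"
).fetchone()[0]
print(total)  # 3500000
```

The history page then reads a single row instead of scanning every record.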
Edit: Just found where the JSON files are getting stored. You've already got this pretty well organized. This could pretty easily be translated into a relational structure that could be used throughout the application. I think a SQLite backend would be a different issue, though. Let me play around with the idea in a fork and see if I can come up with anything useful. Thanks for your input!
@Apocrathia I think you are right. Initially I thought we would only need two JSON dumps to handle all data storage, but it looks like this is growing, and the application could benefit from migrating everything to SQLite now. I was thinking of adding something like https://github.com/coleifer/peewee and breaking the history and settings data out of the config.py file into their own model classes. What do you think?
@Josh5 If you move the settings data into the database, it still needs an initial pull from a file on load, only using the DB for quicker access afterwards. This is why a lot of people still use .ini-format settings files: it's always good to be able to change something manually when things go fucky. Utilizing an ORM would abstract most of the actual SQL away from the code, enabling the user to select which database system they want to use. SQLite by default, but if you have a library of hundreds of thousands of files, you could point it at a MySQL/PostgreSQL instance. That could easily be set up as a Docker Compose example. I actually hadn't seen peewee before. That's a great way to keep everything streamlined without worrying too much about SQL optimization; it just depends on how well it scales. Keeping the database calls in Python also makes them more accessible to other contributors who may not be as comfortable with SQL, and you're mitigating potential SQL injection vulnerabilities by using the ORM.
@Apocrathia All very good points. I've bumped this up to top priority before moving on to expanding the application's settings. I will push something later today.
@Apocrathia DB storage support has been pushed to the dev-settings branch. I'll come back to moving history logging into the database once the other milestone is complete.
I'm removing this from the "Migrate historical data to SQLite" project for now. It is not complete and I want to come back to it at a later date. The rest of that project is complete, so I want to merge it in and start working on hardware encoding. There are a bunch of other features that still need to be added to the UI, like deleting old historical records, filtering, etc. I might create a project for that sort of thing and add this to it.
Some statistics on counts: how many files have been encoded, and how many are left. Beyond that, maybe an estimated time remaining based on previous encode times.
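The estimated-time idea could start as simply as averaging past job durations (a hypothetical helper; the function name and inputs are assumptions):

```python
def estimate_remaining(completed_durations, pending_count):
    """Estimate total seconds left: average past job duration x pending files.

    Returns None when there is no history to average yet.
    """
    if not completed_durations:
        return None
    avg = sum(completed_durations) / len(completed_durations)
    return avg * pending_count

# Three past jobs averaging 150s, with 10 files still queued:
print(estimate_remaining([120.0, 180.0, 150.0], 10))  # 1500.0
```

A fancier version could weight recent jobs more heavily or group by codec, but a plain average is a cheap first pass.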