
Create a table to "pull and display" OpenWRT table of hardware #112

Closed hunghvu closed 6 months ago

hunghvu commented 8 months ago

Should be a fun one to do. Essentially, the OpenWRT ToH page is too slow and often out of date. We can try implementing a more performant ToH with a better UX as a homelab side project. OpenWRT wiki content is licensed under CC BY-SA 4.0, so this should be a viable project.

We might start by working from the CSV export of the ToH.
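A minimal sketch of the pull step, assuming the wiki exposes a machine-readable dump. The URL and the tab separator are assumptions, and a real parser would also need to handle quoting:

```ts
// Minimal sketch: pull the ToH dump and turn each line into a keyed row
// object. TOH_DUMP_URL is a hypothetical placeholder, not a confirmed URL.
const TOH_DUMP_URL = 'https://openwrt.org/toh-dump.tsv';

type TohRow = Record<string, string>;

async function fetchToh(): Promise<TohRow[]> {
  const res = await fetch(TOH_DUMP_URL);
  if (!res.ok) throw new Error(`ToH fetch failed: ${res.status}`);
  const [header, ...lines] = (await res.text()).trim().split('\n');
  const columns = header.split('\t'); // tab-separated is an assumption
  // Map each data line onto the header so every row becomes { column: value }.
  return lines.map((line) => {
    const cells = line.split('\t');
    return Object.fromEntries(columns.map((col, i) => [col, cells[i] ?? '']));
  });
}
```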

This is a big ticket.


Note: next time, treat a big ticket like this as a milestone, not just an issue.

hunghvu commented 8 months ago

Thought flow:

hunghvu commented 8 months ago

This is a good thread to go over for ideas: https://forum.openwrt.org/t/improving-the-table-of-hardware-toh/139259

hunghvu commented 8 months ago

How do we call Payload?

Also, how do we implement exponential backoff for failed jobs within a 24-hour interval? A sketch follows.
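One way to cut it, as a sketch: retry with exponentially growing delays, and give up once the next attempt would spill past the 24-hour window, since the next scheduled run takes over anyway. The base delay is an assumption:

```ts
// Retry a failed sync job with exponential backoff, capping the total time
// so all attempts fit inside the 24-hour window between scheduled runs.
const DAY_MS = 24 * 60 * 60 * 1000;

async function withBackoff<T>(
  job: () => Promise<T>,
  baseDelayMs = 60_000, // first retry after 1 min, then 2, 4, 8, ... (assumption)
): Promise<T> {
  const deadline = Date.now() + DAY_MS;
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await job();
    } catch (err) {
      const delayMs = baseDelayMs * 2 ** attempt;
      // Stop once the next attempt would exceed the 24h window; the next
      // scheduled run picks up from there.
      if (Date.now() + delayMs >= deadline) throw err;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

The job itself could then be, for example, a call into Payload's Local API, wrapped as `withBackoff(() => payload.update(...))`; that wiring is left out here.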

hunghvu commented 8 months ago

When a table has 2000+ rows, what should be a preferable approach to perform an update?

  1. Should we just delete the whole collection and re-import every day? This is the simplest approach, but it is resource-intensive and can potentially crash the virtual machine, as seen in #79.
  2. Selective update? The current dataset is 2000+ rows and 70+ columns, so >140k cells.
    • If we iterate the dataset and fetch only one row per pid, the request/response size stays small, so the virtual machine is not at risk. However, this means making 2000+ requests. Is the network overhead negligible?
    • If we fetch the whole dataset for comparison, the request/response size can be unexpectedly large and potentially crash the VM. However, we reduce the network overhead.

Need to think more about these trade-offs; one possible middle ground is sketched below.
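A sketch of that middle ground: walk the fresh dump in fixed-size chunks and upsert each chunk with a single `bulkWrite`, keyed by pid. The database name, collection name, key field, and chunk size are all assumptions:

```ts
import { AnyBulkWriteOperation, MongoClient } from 'mongodb';

const CHUNK_SIZE = 500;

async function selectiveUpdate(rows: Record<string, string>[], client: MongoClient) {
  const toh = client.db('homelab').collection('toh');
  for (let i = 0; i < rows.length; i += CHUNK_SIZE) {
    // Each chunk becomes one bulkWrite, i.e. one round trip, and each row
    // is upserted by pid so unchanged rows are simply overwritten in place.
    const ops: AnyBulkWriteOperation[] = rows.slice(i, i + CHUNK_SIZE).map((row) => ({
      updateOne: {
        filter: { pid: row.pid },
        update: { $set: row },
        upsert: true,
      },
    }));
    await toh.bulkWrite(ops, { ordered: false });
  }
}
```

With ~2,000 rows this is about four round trips, instead of 2,000 single-row requests or one giant payload.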

hunghvu commented 8 months ago

The current implementation hits MongoDB (local test) pretty hard.

[screenshot: MongoDB operation graph under load during the import]

hunghvu commented 8 months ago

Actually, it was a bug: we did not await the patch requests to MongoDB, so a flood of requests hammered the database. The bug was discovered accidentally when we ran ESLint in a5e6516179e2b32fb658fce336a9ee17508bd6c4. The graph below shows database performance after we await our requests.

[screenshot: MongoDB performance graph after awaiting the requests]
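For reference, the shape of the bug, as a sketch with hypothetical names; this is the class of mistake that ESLint's @typescript-eslint/no-floating-promises rule flags:

```ts
declare const changedRows: Record<string, string>[];
declare function updateRow(row: Record<string, string>): Promise<void>;

// Before: every patch is a floating promise, so all 2,000+ requests hit
// MongoDB concurrently and nothing throttles the flood.
for (const row of changedRows) {
  updateRow(row);
}

// After: awaiting each request serializes them and keeps the load flat.
for (const row of changedRows) {
  await updateRow(row);
}
```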

hunghvu commented 7 months ago

For the front end:

For the back end:

hunghvu commented 6 months ago

When filtering, the current approach is as follows:

  1. Filter
  2. Fetch new data
  3. Show all filtered results

The problem is that when the new filtered dataset is returned, all filtering choices are reset, because the choices are derived from the dataset. How do we maintain the choices?

But if the choices are maintained, whatever new result is returned from the backend will not be reflected in them.

Or perhaps we should avoid filtering on click? We can let users define the filter set and then apply it afterward. To make a new query, users need to clear the existing filter. This behavior resembles Excel; see the sketch below.
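A sketch of that Excel-like flow, with hypothetical names: selections accumulate locally in a pending filter, and only one query is sent when the user applies it, so the refetch cannot wipe the choices:

```ts
declare function fetchFilteredRows(filter: Map<string, Set<string>>): Promise<void>;

interface FilterState {
  pending: Map<string, Set<string>>; // column -> values picked but not yet queried
  applied: Map<string, Set<string>>; // the filter the current dataset reflects
}

// Clicking a choice only mutates the pending filter; no request is made.
function toggleChoice(state: FilterState, column: string, value: string): void {
  const values = state.pending.get(column) ?? new Set<string>();
  if (values.has(value)) values.delete(value);
  else values.add(value);
  state.pending.set(column, values);
}

// "Apply" copies the pending filter and issues a single query. The choices
// survive the refetch because they are stored, not re-derived from the
// (now filtered) dataset.
async function applyFilters(state: FilterState): Promise<void> {
  state.applied = structuredClone(state.pending);
  await fetchFilteredRows(state.applied);
}
```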

Or, we simply separate the generation of filter choices from the filtered table data, meaning we always request a dedicated dataset for the choices on every table refresh.

hunghvu commented 6 months ago

Or should we skip fetching new data and just filter what is already on the current page? This does not seem like what users would expect, since filtering would be confined to the data on the current page only.

On the other hand, if we need a dedicated request to get the options for the multi-select, we could serve it from a dedicated endpoint; a sketch follows.
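A sketch of such an endpoint, with a hypothetical route and names: it returns the distinct values of one column, independent of whatever filter the table currently has applied:

```ts
import express from 'express';
import { MongoClient } from 'mongodb';

const app = express();
const client = new MongoClient(process.env.MONGODB_URI ?? 'mongodb://localhost:27017');

app.get('/api/toh/options/:column', async (req, res) => {
  const toh = client.db('homelab').collection('toh');
  // distinct() deduplicates server-side, so the response stays small even
  // though the collection has 2,000+ rows.
  const options = await toh.distinct(req.params.column);
  res.json(options.filter((value) => value !== null && value !== ''));
});

app.listen(3000);
```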

hunghvu commented 6 months ago

Done with b7089566fefcd1e75fa66e3f6cf5cb05b32fc688.