InvisiblePlatform / rosetta

Invisible Voice Version 3.0 The Reckoning

Performance Enhancing #23

Open ixt opened 2 years ago

ixt commented 2 years ago

The scripts taken from the old IV for updating some databases are slow; we can do better. A generic function needs to be written that can be applied to any large process, so that we can cut down the time these take.
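
A minimal sketch of what such a generic helper might look like, assuming the slow steps can be expressed as one function applied to many independent records. The names `run_parallel` and `normalise_name` are hypothetical and not taken from the repository.

```python
from concurrent.futures import Executor, ProcessPoolExecutor
from typing import Callable, Iterable, List, Type, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_parallel(
    func: Callable[[T], R],
    items: Iterable[T],
    workers: int = 4,
    chunksize: int = 64,
    executor_cls: Type[Executor] = ProcessPoolExecutor,
) -> List[R]:
    """Apply func to every item using a pool of workers, preserving input order."""
    with executor_cls(max_workers=workers) as pool:
        # chunksize batches items per task; it only matters for process pools.
        return list(pool.map(func, items, chunksize=chunksize))


def normalise_name(name: str) -> str:
    # Stand-in for a real per-record update step (hypothetical).
    return name.strip().lower()
```

Processes suit CPU-bound steps; for steps that mostly wait on files or the network, passing `executor_cls=ThreadPoolExecutor` may be enough.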

ixt commented 2 years ago

The build also takes a while: currently ~20 min with ~61k companies. See https://gohugo.io/troubleshooting/build-performance/. May roll this out into a milestone and attach tickets to it.

ixt commented 2 years ago

Template Metrics:

     cumulative       average       maximum         
       duration      duration      duration  count  template
     ----------      --------      --------  -----  --------
  2h32m40.719219345s  149.479786ms  4.422877594s  61284  _default/single.html
  2h29m56.654756681s  172.016878ms  774.476185ms  52301  shortcodes/wikidata.html
  2m26.365946101s   33.002468ms  382.353172ms   4435  shortcodes/mbfc.html
   8.348924513s     136.233µs    9.102766ms  61284  _default/single.json.json
   7.541875447s     123.064µs    9.091853ms  61284  _default/item.json.json
   2.330389807s     917.837µs    25.47144ms   2539  shortcodes/goodonyou.html
   2.238033159s     594.115µs   29.748518ms   3767  shortcodes/bcorp.html
   875.964247ms  875.964247ms  875.964247ms      1  index.html
   776.248116ms  776.248116ms  776.248116ms      1  _internal/_default/sitemap.xml
    89.214056ms   89.214056ms   89.214056ms      1  list.json
     2.201241ms       1.217µs      52.019µs   1808  shortcodes/glassdoor.html
      573.665µs     286.832µs     300.336µs      2  _internal/_default/rss.xml
      353.222µs     176.611µs     247.432µs      2  _default/list.html
      119.038µs     119.038µs     119.038µs      1  404.html
      102.842µs       34.28µs      99.143µs      3  partials/header.html
       85.133µs      28.377µs      81.964µs      3  partials/footer.html
ixt commented 2 years ago
Template Metrics:

     cumulative       average       maximum      cache  percent  cached  total
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  36m46.288933814s   35.999297ms  2.182343548s          0        0       0  61287  _default/single.html
  35m27.420130624s   40.671805ms  1.670070498s          0        0       0  52307  shortcodes/wikidata.html
  38.549340986s     482.318µs   74.961531ms          0        0       0  79925  shortcodes/wikipedia.html
  29.320595732s   30.638031ms  777.566366ms          0        0       0    957  shortcodes/mbfc.html
  10.124629668s       165.2µs   19.219611ms          0        0       0  61287  _default/single.json.json
   9.038566545s     147.479µs   19.207043ms          0        0       0  61287  _default/item.json.json
   967.049743ms  967.049743ms  967.049743ms          0        0       0      1  index.html
   808.137155ms  808.137155ms  808.137155ms          0        0       0      1  list.json
   710.587553ms  710.587553ms  710.587553ms          0        0       0      1  _internal/_default/sitemap.xml
   240.628887ms     759.081µs   21.083936ms          0        0       0    317  shortcodes/goodonyou.html
   117.262296ms     102.502µs    8.649381ms          0        0       0   1144  shortcodes/glassdoor.html
    60.615176ms     492.806µs    1.212688ms          0        0       0    123  shortcodes/bcorp.html
       76.315µs      76.315µs      76.315µs        100        0       0      1  partials/header.html
       48.791µs      48.791µs      48.791µs        100        0       0      1  partials/footer.html
       41.322µs      41.322µs      41.322µs          0        0       0      1  404.html

Total in 282976 ms

more gains
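
To put a number on gains like these, a small helper can convert the Go-style duration strings in the metrics tables into seconds. This is a sketch only; `go_duration_seconds` is a hypothetical name and it handles just the unit suffixes that appear in these tables.

```python
import re

# Seconds per unit for the suffixes that appear in the tables.
_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "µs": 1e-6, "us": 1e-6, "ns": 1e-9}
# Longer suffixes come first so "ms" is not read as "m" followed by "s".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|µs|us|ns|h|m|s)")


def go_duration_seconds(text: str) -> float:
    """Convert a duration such as '2h32m40.719219345s' or '136.233µs' to seconds."""
    parts = _TOKEN.findall(text)
    # Reject strings that contain anything other than number+unit tokens.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"not a recognised duration: {text!r}")
    return sum(float(number) * _UNITS[unit] for number, unit in parts)


def speedup(before: str, after: str) -> float:
    """Ratio of two durations; values above 1 mean the second run was faster."""
    return go_duration_seconds(before) / go_duration_seconds(after)
```

Applied to the `_default/single.html` cumulative durations of the two runs above (`2h32m40.719219345s` and `36m46.288933814s`), the ratio comes out at roughly 4.2. The cumulative figures appear to be summed across parallel renders, which would explain why they exceed the wall-clock total, so they are best compared with each other rather than with the build time.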

ixt commented 2 years ago

The last couple of days have shown that the local development machine does not have enough resources. Expanding swap is a workaround for mid-RAM machines when rendering the site in Hugo, which now takes over 32 GB of RAM at its peak. It's unclear as I write this what the requirements are now that we are in the 75k-companies zone with the 3-layer-deep wikipedia/wikidata associations. The growth is more likely related to the associations than to the 30% increase in companies.

The local cache of JSON requests is closing in on 70k, and a rough estimate is that this may double. There's gonna be turbulence, kids!
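
For illustration, a local JSON request cache of this kind could be as simple as one file per URL. This is a sketch under assumptions: the function name, the hashing scheme, and the injected `fetch` callable are all hypothetical and say nothing about how the real cache is laid out.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Callable


def cached_json(url: str, cache_dir: Path, fetch: Callable[[str], Any]) -> Any:
    """Return the JSON for url, reading from cache_dir when it is already stored."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so that any URL maps to a safe, fixed-length filename.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch(url)  # only reached on a cache miss
    path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

If the cache does double in size, splitting files into subdirectories by the first characters of the hash is one way to keep any single directory small.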

ixt commented 2 years ago

     cumulative       average       maximum      cache  percent  cached  total
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  1h28m58.853124078s   72.914233ms  33.717832408s          0        0       0  73221  _default/single.html
  48m13.830595945s   44.970871ms  5.142949473s          0        0       0  64349  shortcodes/wikidata.html
  33m22.422788396s    8.830231ms  33.68921873s          0        0       0  226769  shortcodes/wikipedia.html
  1m5.073124397s     888.722µs  509.323188ms         31        0       0  73221  partials/similar.html
  15.627288786s   21.089458ms  925.213586ms          0        0       0    741  shortcodes/isin.html
   6.709231793s    2.517535ms  329.599932ms          0        0       0   2665  shortcodes/goodonyou.html
    6.21385977s    1.673992ms  116.356577ms          0        0       0   3712  shortcodes/bcorp.html
   4.013184631s  4.013184631s  4.013184631s          0        0       0      1  index.html
   3.797138332s    1.884435ms  107.030338ms          0        0       0   2015  shortcodes/glassdoor.html
   3.164952574s  3.164952574s  3.164952574s          0        0       0      1  _internal/_default/sitemap.xml
   1.516108155s  1.516108155s  1.516108155s          0        0       0      1  list.json
   223.185039ms  111.592519ms  222.963809ms          0        0       0      2  _default/list.html
    60.051888ms       1.193µs    4.734845ms          0        0       0  50313  shortcodes/trust.html
    29.175557ms   14.587778ms   17.479311ms          0        0       0      2  _internal/_default/rss.xml
      215.029µs      71.676µs     210.719µs        100        0       0      3  partials/header.html
       61.685µs      20.561µs       58.58µs        100        0       0      3  partials/footer.html
       60.794µs      60.794µs      60.794µs          0        0       0      1  404.html

                   |  EN
-------------------+--------
  Pages            | 73227
  Paginator pages  |     0
  Non-page files   |     0
  Static files     |    11
  Processed images |     0
  Aliases          |     0
  Sitemaps         |     1
  Cleaned          |     0

Built in 712582 ms
ixt commented 2 years ago

I've been thinking of ways to further speed up the template generation code, since it is the slowest part of the pipeline. We already parallelise most of it and it still takes hours, which should really not be the case. I think we are probably hitting file-access speed limits. Perhaps moving the wikidata caches to a ramdisk would speed things up.
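
A sketch of the ramdisk idea, assuming a Linux machine where `/dev/shm` is a RAM-backed tmpfs mount. The helper name `stage_in_ram` and the cache directory layout are hypothetical.

```python
import shutil
from pathlib import Path


def stage_in_ram(cache_dir: Path, ram_root: Path = Path("/dev/shm")) -> Path:
    """Copy cache_dir under ram_root and return the location of the copy."""
    target = ram_root / cache_dir.name
    # dirs_exist_ok (Python 3.8+) lets a repeat run refresh an existing copy.
    shutil.copytree(cache_dir, target, dirs_exist_ok=True)
    return target
```

The generation scripts would then read from the returned path instead of the on-disk cache. The copy counts against RAM, so this competes with the memory the build itself needs.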

ixt commented 2 years ago

Limited run (indexes: Goodonyou, mbfc, bcorp, glassdoor) Timing for limited run without changes:

real    16m4.411s
user    108m21.644s
sys     10m50.220s

Just wikicache in ram:

real    8m39.988s
user    32m41.911s
sys     11m2.001s

Wikicache + Wikidata index in ram + avoiding wikipage lookups if known bad:

real    16m14.008s
user    102m35.283s
sys     18m8.402s

Reengineered the associated_companies function to work more via associations + above:

real    3m42.970s
user    21m43.202s
sys     6m21.180s

Full regeneration with the new optimisations is estimated at around 2 hours.
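
The "avoid lookups if known bad" step can be sketched as a negative cache: remember which titles failed and never fetch them again. The class and the `fetch` callable below are hypothetical stand-ins, not the actual lookup code.

```python
from typing import Callable, Dict, Optional, Set


class PageLookup:
    """Wrap a page fetcher so that known-bad titles are only ever tried once."""

    def __init__(self, fetch: Callable[[str], Optional[dict]]) -> None:
        self._fetch = fetch
        self._known_bad: Set[str] = set()
        self._pages: Dict[str, dict] = {}

    def get(self, title: str) -> Optional[dict]:
        if title in self._known_bad:
            return None  # skip the expensive lookup entirely
        if title in self._pages:
            return self._pages[title]
        page = self._fetch(title)
        if page is None:
            self._known_bad.add(title)
        else:
            self._pages[title] = page
        return page
```

Persisting the known-bad set between runs would extend the saving to later builds, at the cost of needing to expire entries when a page starts existing.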

ixt commented 2 years ago

Generation:

real    144m48.294s
user    917m11.294s
sys     225m40.164s

2.5~ hours, 72562 pages.
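
As a quick sanity check on how well a run parallelises, the `real`/`user`/`sys` lines can be turned into seconds and compared. A sketch with hypothetical helper names, assuming the `XmY.YYYs` format shown in these comments:

```python
import re
from typing import Dict

_TIME_LINE = re.compile(r"^(real|user|sys)\s+(\d+)m(\d+(?:\.\d+)?)s$")


def parse_time_output(text: str) -> Dict[str, float]:
    """Parse 'real 144m48.294s' style lines into seconds, keyed by label."""
    times: Dict[str, float] = {}
    for line in text.splitlines():
        match = _TIME_LINE.match(line.strip())
        if match:
            label, minutes, seconds = match.groups()
            times[label] = int(minutes) * 60 + float(seconds)
    return times


def cpu_parallelism(times: Dict[str, float]) -> float:
    """CPU seconds used per wall-clock second, i.e. average busy cores."""
    return (times["user"] + times["sys"]) / times["real"]
```

For the generation run above this works out to just under 8 busy cores on average. A ratio well below the machine's core count would point at waiting on I/O rather than computation.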

ixt commented 2 years ago

                   |  EN    
-------------------+--------
  Pages            | 72567  
  Paginator pages  |     0  
  Non-page files   |     0  
  Static files     |    11  
  Processed images |     0  
  Aliases          |     0  
  Sitemaps         |     1  
  Cleaned          |     0  

Total in 512401 ms

real    8m34.018s
user    59m41.124s
sys     1m15.664s
ixt commented 2 years ago
Template Metrics:

     cumulative       average       maximum      cache  percent  cached  total
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  1h44m6.231974381s   22.076951ms  36.10302944s          0        0       0  282930  _default/single.html
  59m51.147158496s   55.764886ms  3.357594876s          0        0       0  64398  shortcodes/wikidata.html
  33m0.7615263s    5.136776ms  3.107983725s          0        0       0  385604  shortcodes/wikipedia.html
  2m50.403177544s   38.422362ms  1.787263179s         84        0       0   4435  partials/mbfc.html
  1m14.390977203s      262.93µs    875.8224ms         82        0       0  282930  partials/similar.html
  55.476409452s     196.078µs  576.416125ms         95        0       0  282930  partials/trustpilot.html
  55.416964002s  55.416964002s  55.416964002s          0        0       0      1  index.html
  24.642992795s  24.642992795s  24.642992795s          0        0       0      1  _internal/_default/sitemap.xml
  19.419809363s   26.207569ms  408.476996ms          0        0       0    741  shortcodes/isin.html
   17.51518555s  17.51518555s  17.51518555s          0        0       0      1  list.json
    8.63296202s    3.064594ms  400.415609ms         12        0       0   2817  partials/goodonyou.html
   7.229909967s    1.131264ms   88.693449ms          0        0       0   6391  shortcodes/glassdoor.html
   6.816600941s    1.766416ms  114.798836ms         70        0       0   3859  partials/bcorp.html
    484.07015ms  242.035075ms  304.517501ms          0        0       0      2  _default/list.html
   272.918058ms  136.459029ms  215.955115ms          0        0       0      2  _internal/_default/rss.xml
    15.419874ms      10.151µs     151.404µs         71        0       0   1519  partials/tosdr.html
      554.183µs     184.727µs     550.679µs        100        0       0      3  partials/header.html
      173.334µs     173.334µs     173.334µs          0        0       0      1  404.html
       94.472µs       31.49µs      82.851µs        100        0       0      3  partials/footer.html

                   |   EN
-------------------+---------
  Pages            | 282936
  Paginator pages  |      0
  Non-page files   |      0
  Static files     |     11
  Processed images |      0
  Aliases          |      0
  Sitemaps         |      1
  Cleaned          |      0

Built in 1016131 ms
ixt commented 2 years ago

It's worth noting that the extension now loads faster due to file-size adjustments and the way we load data in the newest format.

ixt commented 2 years ago

With the removal of tags from the frontend, plus some novel partial-caching for many components in the site, site generation is very fast now.

Build times for making records still suck: 6+ hours for the non-wikipedia db. This should improve dramatically post #64.

Page loading times are likely the next frontier, as the pages are getting pretty large. Lazy-loading the wikipedia sections might be good to do when the redesign is implemented.

ixt commented 2 years ago
Template Metrics:

     cumulative       average       maximum      cache  percent  cached  total  
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  14m37.595504124s    7.524095ms  2.554144783s          0        0       0  116638  _default/single.html
  6m5.964251761s    1.171426ms  2.438299011s          0        0       0  312409  shortcodes/wikipedia.html
  1m54.810978971s   24.130092ms  1.520782643s         78        0       0   4758  partials/mbfc.html
  1m44.81354175s     898.622µs  1.919712924s         44        0       0  116638  partials/similar.html
  1m26.96919481s     275.312µs  1.479860378s          0       89  282322  315893  partials/wikidata-name.html
  1m18.76639665s     675.306µs   339.28632ms         84        0       0  116638  partials/trustpilot.html
  12.310111067s    6.097132ms  2.197388928s         64        0       0   2019  partials/wikidata.html
  10.396321988s   13.589963ms   48.705537ms          0        0       0    765  shortcodes/isin.html
   5.462238553s     675.768µs   47.479746ms         50        6     514   8083  partials/glassdoor.html
   5.302672015s    1.370199ms   31.918444ms         70        0      12   3870  partials/bcorp.html
   4.593098407s    1.514374ms   17.013714ms         58        3      77   3033  partials/goodonyou.html
   2.137098218s   89.045759ms  656.586977ms          0        0       0     24  index.html
   2.098289527s  2.098289527s  2.098289527s          0        0       0      1  list.json
   1.349149817s  1.349149817s  1.349149817s          0        0       0      1  _internal/_default/sitemap.xml
    932.89521ms     146.589µs   23.954095ms         87       15     946   6364  partials/yahoo.html
   341.846221ms   170.92311ms  172.913986ms          0        0       0      2  _default/list.html
   211.835908ms       5.245µs    1.848574ms         99        0       0  40383  partials/d3-graph.html
    66.206637ms      35.404µs   17.098951ms         71       18     338   1870  partials/tosdr.html
    25.798182ms   12.899091ms   12.938325ms          0        0       0      2  _internal/_default/rss.xml
     6.078015ms      253.25µs    2.812375ms         10        0       0     24  partials/inline/pagination/default
      169.599µs     169.599µs     169.599µs          0        0       0      1  _internal/alias.html
      123.934µs     123.934µs     123.934µs          0        0       0      1  partials/header.html
       85.186µs      85.186µs      85.186µs          0        0       0      1  partials/footer.html
       77.238µs      77.238µs      77.238µs          0        0       0      1  404.html
       76.067µs       3.042µs       4.932µs        100      100      25     25  partials/header
       69.706µs       2.788µs       3.964µs        100      100      25     25  partials/footer

                   |   EN    
-------------------+---------
  Pages            | 116644  
  Paginator pages  |     23  
  Non-page files   |      0  
  Static files     |  40394  
  Processed images |      0  
  Aliases          |      1  
  Sitemaps         |      1  
  Cleaned          |      0  

Built in 174528 ms
ixt commented 2 years ago

https://numba.pydata.org/numba-doc/dev/user/5minguide.html#other-things-of-interest https://www.youtube.com/watch?v=x58W9A2lnQc

ixt commented 2 years ago

real   169m43.761s
user   1059m5.810s
sys    123m4.349s

record building, ramcache, new queue, 110k records

ixt commented 2 years ago
     cumulative       average       maximum      cache  percent  cached  total
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  20m0.339863008s    9.530894ms  487.531414ms          0        0       0  125942  _default/single.html
  11m58.543533029s    7.995276ms  103.086964ms         65        0       0  89871  partials/wikidata-social.html
  1m44.744356654s   22.588819ms   84.407186ms         84        0       0   4637  partials/mbfc.html
  31.502758266s     250.137µs    25.97967ms         41        0       0  125942  partials/similar.html
  13.099575902s    3.829165ms   42.822559ms         64        0       0   3421  partials/wikidata-link.html
  10.524146817s    4.455608ms  481.986695ms         64        0       0   2362  partials/wikidata.html
   5.849616047s      46.446µs   13.648799ms         85        0       0  125942  partials/trustpilot.html
    2.13679302s   2.13679302s   2.13679302s          0        0       0      1  list.json
   1.981989012s      94.791µs   26.074203ms         52        3     556  20909  partials/glassdoor.html
   1.604656045s     544.135µs    3.840288ms         58        1      20   2949  partials/goodonyou.html
   1.387999342s     359.213µs   16.670369ms         70        0       8   3864  partials/bcorp.html
   1.380791454s  1.380791454s  1.380791454s          0        0       0      1  _internal/_default/sitemap.xml
   1.023314489s      431.96µs  478.699958ms          0       85    2011   2369  partials/wikidata-name.html
   957.274079ms   36.818233ms  340.243969ms          0        0       0     26  index.html
   885.467862ms       8.609µs   20.110186ms         99        1     890  102843  partials/d3-graph.html
   327.188839ms      48.186µs   31.313934ms         86       20    1377   6790  partials/yahoo.html
    77.452473ms      32.958µs    4.823941ms         71       31     731   2350  partials/tosdr.html
     3.243395ms     124.745µs     249.907µs          9        0       0     26  partials/inline/pagination/default
      678.871µs     339.435µs     354.186µs          0        0       0      2  _internal/_default/rss.xml
      301.141µs      150.57µs     224.607µs          0        0       0      2  _default/list.html
      118.947µs     118.947µs     118.947µs          0        0       0      1  partials/header.html
      110.162µs     110.162µs     110.162µs          0        0       0      1  _internal/alias.html
       97.432µs      97.432µs      97.432µs          0        0       0      1  partials/footer.html
       59.921µs      59.921µs      59.921µs          0        0       0      1  404.html
       33.139µs       1.227µs       1.935µs        100      100      27     27  partials/header
       32.122µs       1.189µs       2.944µs        100      100      27     27  partials/footer

                   |   EN
-------------------+---------
  Pages            | 125948
  Paginator pages  |     25
  Non-page files   |      0
  Static files     | 104734
  Processed images |      0
  Aliases          |      1
  Sitemaps         |      1
  Cleaned          |      0

Built in 164413 ms
ixt commented 1 year ago
real    5016m46.202s
user    29916m24.678s
sys 1732m54.162s
ixt commented 1 year ago
real    472m16.674s
user    1603m49.305s
sys 455m57.039s

After re-engineering a bunch of the wikidata reading code, generation of records now takes less than 1/10th of the time and ingests full wikidata backups.

(316k records)
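
One way to read a full Wikidata JSON dump without loading it all into memory is to stream it line by line, since the published dumps put one entity per line inside a single JSON array. This is a sketch of that general technique, not the reading code described above; `iter_entities` is a hypothetical name.

```python
import json
from typing import Iterable, Iterator


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one entity dict per dump line, skipping the array brackets."""
    for line in lines:
        # Each entity line ends with a comma separating it from the next.
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)
```

For a compressed dump, a file object from `gzip.open(path, "rt", encoding="utf-8")` or `bz2.open(path, "rt", encoding="utf-8")` can be passed in directly as `lines`.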

ixt commented 1 year ago
real    124m46.140s
user    485m4.615s
sys 183m22.453s

Re-engineering pretty much everything to remove as many unneeded operations and reads as possible, leaning heavily on indexing and on lookup tables as associative arrays, has cut the build back down to iterable levels: 316k records in a little over 2 hours. This part of the code base is likely as fast as it can be without shifting paradigm or changing hardware again.
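
The lookup-table approach can be illustrated with a plain dict built in one pass, so that later association lookups cost a hash lookup instead of a scan over all records. The field names and the `build_index` helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List


def build_index(records: Iterable[dict], field: str) -> Dict[Hashable, List[dict]]:
    """Group records by the value of field, skipping records that lack it."""
    index: Dict[Hashable, List[dict]] = defaultdict(list)
    for record in records:
        value = record.get(field)
        if value is not None:
            index[value].append(record)
    # Return a plain dict so that missing keys raise instead of growing the index.
    return dict(index)
```

For example, `build_index(records, "parent")` gives every company owned by a given parent via `index.get(parent_id, [])`, with no further reads of the source data.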

ixt commented 1 year ago
     cumulative       average       maximum      cache  percent  cached  total  
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  25m21.019823717s    4.811572ms  366.693925ms          0        0       0  316117  _default/single.html
  12m59.660282165s    8.390842ms   86.289357ms         38       11   10475  92918  partials/wikidata-social.html
  3m26.105569691s     651.991µs   57.942111ms         69        0       0  316117  partials/similar.html
  2m15.019299076s     427.118µs   45.087759ms         95        0       0  316117  partials/trustpilot.html
  1m53.29823693s   25.511875ms  227.149487ms         85        0       0   4441  partials/mbfc.html
  22.359678619s    9.201513ms   60.635664ms        100       17     411   2430  partials/tosdr.html
   14.15896314s     3.92977ms   24.592855ms         59       11     405   3603  partials/wikidata-link.html
  13.928343581s    5.398582ms   80.468952ms         48        0       0   2580  partials/wikidata.html
  11.743088686s   14.320839ms   77.393568ms         36       16     131    820  partials/isin.html
   7.422477677s     364.705µs   36.511276ms         65        0       1  20352  partials/glassdoor.html
    7.11963097s   7.11963097s   7.11963097s          0        0       0      1  list.json
   5.347148185s      16.915µs   20.819095ms        100      100  316116  316117  partials/translation_picker.html
   4.317184016s  4.317184016s  4.317184016s          0        0       0      1  _internal/_default/sitemap.xml
   3.977777733s    1.031581ms   53.467581ms         67        0       0   3856  partials/bcorp.html
   3.854055036s   60.219609ms  1.796487186s          0        0       0     64  index.html
   3.608306966s    1.169629ms   21.289083ms         90        0       0   3085  partials/goodonyou.html
   3.003740854s    1.156174ms   78.451766ms          1       85    2221   2598  partials/wikidata-name.html
   530.600809ms       5.171µs   21.958319ms         98        0       0  102603  partials/d3-graph.html
    52.154701ms    26.07735ms   26.611968ms          0        0       0      2  _default/list.html
     51.02682ms     797.294µs   42.196833ms          7        0       0     64  partials/inline/pagination/default
    37.274676ms   18.637338ms   18.655971ms          0        0       0      2  _internal/_default/rss.xml
     5.171894ms    5.171894ms    5.171894ms          0        0       0      1  _internal/alias.html
      306.786µs     153.393µs     160.361µs          0        0       0      2  partials/footer.html
      256.466µs       3.945µs     158.067µs        100      100      65     65  partials/footer
      229.878µs     229.878µs     229.878µs          0        0       0      1  404.html
      190.003µs       2.923µs       5.019µs        100      100      65     65  partials/header
      127.341µs       1.989µs      62.325µs        100       98      63     64  partials/logo.html
      117.811µs        1.84µs      79.505µs        100       98      63     64  partials/glossary.html
       70.268µs      70.268µs      70.268µs          0        0       0      1  partials/header.html
                   |   EN    
-------------------+---------
  Pages            | 316123  
  Paginator pages  |     63  
  Non-page files   |      0  
  Static files     | 298327  
  Processed images |      0  
  Aliases          |      1  
  Sitemaps         |      1  
  Cleaned          |      0  

Built in 289017 ms
ixt commented 1 year ago

I forgot to note here that it takes around an hour to rebuild the linking lists that use https-everywhere and duckduckgo's entity radar. This is single-threaded, but it likely isn't something that needs to be made faster, since it's not updated as frequently at the source level, nor does it change much (it mostly just gets larger).

ixt commented 1 year ago
real    124m59.235s
user    957m26.594s
sys 4m23.282s

Pythonic implementation, no graph, unoptimised, parallel

ixt commented 1 year ago
real    0m50.322s
user    3m17.463s
sys 0m46.631s

Pythonic implementation, no graph, minor optimisation, parallel (may be inaccurately fast for first run as caching at db might be happening)

ixt commented 1 year ago
real    385m18.072s
user    2996m15.378s
sys 6m12.448s

Pythonic, Graph and DB, First run, no optimisation, parallel

ixt commented 1 year ago
real    4m45.784s
user    20m11.702s
sys 1m42.426s

Pythonic, Graph and DB, Optimising of graph stuff, cutting out a lot of redundant checking with the way we filtered out claims that weren't available

ixt commented 11 months ago
real    6m44.221s
user    24m20.565s
sys     1m52.417s
ixt commented 11 months ago
real 7m0.162s
user 25m51.683s
sys  2m5.707s

Near feature parity in the new module-based generation. There are lots of places where optimisation can still be done, particularly the checks for whether data exists, which could just be included in the indexes.

ixt commented 10 months ago
real    7m7.406s
user    26m12.165s
sys     2m7.010s

Graph modifications, Even closer to parity. No new optimisations. Inter-db linking now possible.

ixt commented 10 months ago
     cumulative       average       maximum      cache  percent  cached  total  
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  22m49.677435956s    3.956775ms  1.193946594s          0        0       0  346160  _default/single.html
  18m39.565677831s   11.174983ms  781.355323ms         58       10    9962  100185  partials/wikidata-social.html
  22.032729676s    4.808539ms   52.464289ms         49       17     775   4582  partials/wikidata-link.html
  19.399427213s       56.04µs   24.965738ms          0        0       0  346168  _default/single.json
  18.691428542s    8.162195ms   930.32279ms         32        0       0   2290  partials/wikidata.html
   8.909604884s  164.992683ms  7.272690426s          0        0       0     54  _internal/_default/rss.xml
   8.503096603s  8.503096603s  8.503096603s          0        0       0      1  list.json
   7.921953557s  146.702843ms  4.966374926s          0        0       0     54  _default/list.html
   7.793275908s      22.513µs   37.073551ms         87        0       0  346160  partials/tools.html
   5.868767324s  5.868767324s  5.868767324s          0        0       0      1  _internal/_default/sitemap.xml
   5.460243491s    2.384385ms  871.249547ms          4       16     372   2290  partials/wikidata-name.html
   3.140002697s        9.07µs   30.384571ms        100      100  346159  346160  partials/translation_picker.html
   712.772044ms       6.355µs   11.580729ms         98        0     414  112144  partials/d3-graph.html
   218.643849ms  218.643849ms  218.643849ms          0        0       0      1  index.html
    63.324116ms    7.915514ms    8.450334ms          0        0       0      8  _default/index.html
    50.226801ms   50.226801ms   50.226801ms          0        0       0      1  404.html
      927.894µs      14.966µs     217.898µs        100      100      62     62  partials/header
      224.297µs       3.617µs     101.454µs        100      100      62     62  partials/footer
       73.475µs      36.737µs      71.339µs          0        0       0      2  partials/header.html
       51.102µs      25.551µs      49.049µs          0        0       0      2  partials/footer.html

                   |   EN    
-------------------+---------
  Pages            | 692447  
  Paginator pages  |      0  
  Non-page files   |      0  
  Static files     | 322609  
  Processed images |      0  
  Aliases          |      0  
  Sitemaps         |      1  
  Cleaned          |      0  

Built in 364543 ms

In-progress NeoGraph web rendering. No optimisations.

ixt commented 10 months ago
real    4m19.559s
user    13m57.495s
sys     1m36.123s

Module parity, Neo-graph, New Wikidata placement. Small optimisations.

ixt commented 8 months ago

New hugo version:

Start building sites …
hugo v0.123.4-21a41003c4633b142ac565c52da22924dc30637a+extended linux/amd64 BuildDate=2024-02-26T16:33:05Z VendorInfo=snap:0.123.4

Template Metrics:

     cumulative       average       maximum      cache  percent  cached  total
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  2m6.10573053s  575.825253ms  41.299867513s          0        0       0    219  _internal/_default/rss.xml
  1m22.286453084s     241.351µs  132.944331ms          0        0       0  340940  _default/single.html
  33.811785145s      99.169µs  217.641271ms          0        0       0  340948  _default/single.json
  22.089406918s  100.864871ms  8.065513743s          0        0       0    219  _default/list.html
  14.280714141s  14.280714141s  14.280714141s          0        0       0      1  list.json
   9.182200684s      26.932µs  131.874073ms         87        0       0  340940  partials/tools.html
   8.042252143s  8.042252143s  8.042252143s          0        0       0      1  _internal/_default/sitemap.xml
    43.729268ms    5.466158ms   11.553188ms          0        0       0      8  _default/index.html
     7.772373ms    7.772373ms    7.772373ms          0        0       0      1  index.html
      597.254µs       2.642µs      13.751µs        100      100     226    226  partials/footer
      595.033µs       2.621µs      15.444µs        100      100     227    227  partials/header
      287.242µs     143.621µs     273.149µs          0        0       0      2  partials/header.html
       77.916µs      77.916µs      77.916µs          0        0       0      1  404.html
       54.387µs      18.129µs      38.774µs          0        0       0      3  partials/footer.html

                   |   EN
-------------------+---------
  Pages            | 682338
  Paginator pages  |      0
  Non-page files   |      0
  Static files     | 335183
  Processed images |      0
  Aliases          |      0
  Cleaned          |      0

Built in 287164 ms
ixt commented 7 months ago
Template Metrics:

     cumulative       average       maximum      cache  percent  cached  total  
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  17.939979424s      52.617µs   19.074009ms          0        0       0  340948  _default/single.json
  12.435516491s   41.729921ms  7.000073755s          0        0       0    298  _default/list.html
  10.100321255s      29.624µs   15.734765ms          0        0       0  340940  _default/single.html
    9.89668431s   9.89668431s   9.89668431s          0        0       0      1  list.json
     1.365321ms     170.665µs     224.194µs          0        0       0      8  _default/index.html
      861.987µs       2.807µs     165.334µs        100      100     307    307  partials/footer
      729.747µs       2.377µs     111.757µs        100      100     307    307  partials/header
      652.829µs     652.829µs     652.829µs          0        0       0      1  index.html
      149.542µs     149.542µs     149.542µs          0        0       0      1  404.html
       82.074µs      82.074µs      82.074µs          0        0       0      1  partials/footer.html
       56.252µs      56.252µs      56.252µs          0        0       0      1  partials/header.html

                   |   EN    
-------------------+---------
  Pages            | 682197  
  Paginator pages  |      0  
  Non-page files   |      0  
  Static files     | 335183  
  Processed images |      0  
  Aliases          |      0  
  Cleaned          |      0  

Total in 218110 ms

Removed a Scratch and pared down the single template.

ixt commented 6 months ago
     cumulative       average       maximum      cache  percent  cached  total  
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  19.558414639s      57.364µs   68.890234ms          0        0       0  340948  _default/single.json
  12.116188411s  12.116188411s  12.116188411s          0        0       0      1  list.json
  10.541611606s   35.374535ms  4.394736219s          0        0       0    298  _default/list.html
      731.054µs       2.444µs       7.215µs        100      100     299    299  partials/footer
      655.496µs       2.192µs      65.593µs        100      100     299    299  partials/header
      560.269µs     560.269µs     560.269µs          0        0       0      1  index.html
      100.602µs     100.602µs     100.602µs          0        0       0      1  404.html
       69.745µs      69.745µs      69.745µs          0        0       0      1  partials/header.html
        52.08µs       52.08µs       52.08µs          0        0       0      1  partials/footer.html

                   |   EN    
-------------------+---------
  Pages            | 341249  
  Paginator pages  |      0  
  Non-page files   |      0  
  Static files     | 335184  
  Processed images |      0  
  Aliases          |      0  
  Cleaned          |      0  

Total in 214393 ms

Removed single.json, some reworking of various parts. Closing in on using Hugo just for organising files.

ixt commented 6 months ago
Template Metrics:

     cumulative       average       maximum      cache  percent  cached  total  
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
  18.301564449s      53.678µs   22.582681ms          0        0       0  340948  _default/single.json
   12.92894403s   43.385718ms  6.621970429s          0        0       0    298  _default/list.html
  10.001599633s  10.001599633s  10.001599633s          0        0       0      1  list.json
   397.848752ms       1.166µs     1.46608ms          0        0       0  340940  _default/single.html
     1.709728ms     213.716µs     266.647µs          0        0       0      8  _default/index.html
      718.723µs       2.341µs       10.98µs        100      100     307    307  partials/footer
      652.658µs       2.125µs      10.606µs        100      100     307    307  partials/header
      504.723µs     504.723µs     504.723µs          0        0       0      1  index.html
       86.973µs      86.973µs      86.973µs          0        0       0      1  404.html
       51.392µs      51.392µs      51.392µs          0        0       0      1  partials/header.html
       37.343µs      37.343µs      37.343µs          0        0       0      1  partials/footer.html

                   |   EN    
-------------------+---------
  Pages            | 682197  
  Paginator pages  |      0  
  Non-page files   |      0  
  Static files     | 335184  
  Processed images |      0  
  Aliases          |      0  
  Cleaned          |      0  

Total in 242435 ms

Disabling the page output kills the pages I wanted to keep (whoops). So I can't remove that until after switching to JSON output for rosetta/matching. Above is the current state with the redundant pages left in.

ixt commented 6 months ago
                   |   EN    
-------------------+---------
  Pages            |     11  
  Paginator pages  |      0  
  Non-page files   |      0  
  Static files     | 676158  
  Processed images |      0  
  Aliases          |      0  
  Cleaned          |      0  

Total in 298579 ms

real    4m58.761s
user    1m15.872s
sys     1m20.760s

The change to pre-making the JSON files has made the Hugo build slower; I didn't know that static file copying causes this much slowdown. Generating the pages is now basically instant, with most of the time going to copying the data. We should probably break out the data storage or move it to roundabout. I am hesitant and will let it simmer before making any major changes. There may be some build settings that can fix some of this slowdown. With this many static files, I'm guessing that a lot of cycles are spent watching for changes in the ~700k files when running hugo server.
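If the data does get broken out of Hugo's static directory, one option is to sync it separately and only copy files that changed. Below is a minimal sketch of that idea, not what the repo currently does. The function name `sync_changed` and the paths in the usage note are hypothetical, and the "unchanged" check (same size, destination mtime not older, compared in whole seconds) is an assumption that can miss a same-size edit made within the same second.

```python
import os
import shutil
from pathlib import Path


def sync_changed(src: Path, dst: Path) -> tuple[int, int]:
    """Copy files from src to dst, skipping ones that look unchanged.

    A file is treated as unchanged when the destination exists, has the
    same size, and has an mtime (in whole seconds) at least as new as the
    source. Returns (copied, skipped).
    """
    copied = skipped = 0
    for root, _dirs, files in os.walk(src):
        target_dir = dst / Path(root).relative_to(src)
        target_dir.mkdir(parents=True, exist_ok=True)
        for name in files:
            source = Path(root) / name
            target = target_dir / name
            s_stat = source.stat()
            try:
                t_stat = target.stat()
            except FileNotFoundError:
                t_stat = None
            if (
                t_stat is not None
                and t_stat.st_size == s_stat.st_size
                and int(t_stat.st_mtime) >= int(s_stat.st_mtime)
            ):
                skipped += 1
                continue
            # copy2 preserves the mtime, so the next run can skip this file.
            shutil.copy2(source, target)
            copied += 1
    return copied, skipped
```

Usage would look something like `sync_changed(Path("data"), Path("public/data"))`, where both paths are placeholders. The walk still has to stat every file, so with ~700k files it is not free, but it avoids rewriting data that has not changed.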

ixt commented 6 months ago
     cumulative       average       maximum      cache  percent  cached  total  
       duration      duration      duration  potential   cached   count  count  template
     ----------      --------      --------  ---------  -------  ------  -----  --------
     1.374191ms     171.773µs     240.698µs          0        0       0      8  _default/index.html
      569.149µs     569.149µs     569.149µs          0        0       0      1  index.html
      478.926µs     478.926µs     478.926µs          0        0       0      1  list.json
       69.958µs      69.958µs      69.958µs          0        0       0      1  partials/header.html
       65.665µs      65.665µs      65.665µs          0        0       0      1  partials/footer.html
       62.705µs      62.705µs      62.705µs          0        0       0      1  404.html
       19.452µs       2.161µs       4.744µs        100      100       9      9  partials/header
       15.802µs       1.755µs       2.437µs        100      100       9      9  partials/footer

                   |   EN    
-------------------+---------
  Pages            |     11  
  Paginator pages  |      0  
  Non-page files   |      0  
  Static files     | 676158  
  Processed images |      0  
  Aliases          |      0  
  Cleaned          |      0  

Built in 426746 ms

Alloc = 184.4 MB
TotalAlloc = 9.2 GB
Sys = 571.9 MB
NumGC = 176

These are the template stats for the live build using hugo server.

ixt commented 6 months ago
Copy Entities Start 934
718004/720451 [00:44<00:00, 12934.40it/s]
Initial Processing complete.
208316
'trustscore/puçoles.json'
Rosetta Start 986
647/650 [00:04<00:00, 144.06it/s]
Copy Files Start 995
Tags Start 996
100%|[00:28<00:00, 27074.12it/s]
End 1025
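To see where the time goes in the run above, the stage markers can be turned into per-stage durations. This is a rough sketch: the log does not state a unit, so treating the numbers as seconds is an assumption (the progress bars showing 00:44 and 00:28 are at least consistent with it). The name `stage_durations` is made up for illustration.

```python
# Stage markers copied from the log above. The unit is not stated there;
# treating the numbers as seconds is an assumption.
markers = [
    ("Copy Entities", 934),
    ("Rosetta", 986),
    ("Copy Files", 995),
    ("Tags", 996),
    ("End", 1025),
]


def stage_durations(markers):
    """Return (stage, duration) pairs from consecutive (name, start) markers."""
    return [
        (name, next_start - start)
        for (name, start), (_next_name, next_start) in zip(markers, markers[1:])
    ]


durations = stage_durations(markers)
total = markers[-1][1] - markers[0][1]

# Print each stage with its duration and share of the whole run.
for name, duration in durations:
    print(f"{name:<14}{duration:>4}  {duration / total:.0%}")
```

On these numbers, Copy Entities (which includes the initial processing step) takes 52 of the 91 units and Tags takes 29, so those two stages are the ones worth looking at first.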