pcdshub / transfocate

Automated calculation of beryllium lens focusing optics
https://pcdshub.github.io/transfocate
Other
1 stars 7 forks source link

Import slowness diagnosis #55

Closed ZLLentz closed 2 years ago

ZLLentz commented 2 years ago

I've been trying to track down and eliminate startup slowness in our python environments. This transfocate module has a slow import so I wanted to break it down and see if there was anything we can do to fix it.

First, the top of the current happi benchmark --sort_key import_time output:

+--------------------------+-----------------------+------------+-----------------------+-----------------------+------------------------+
|           name           |        avg_time       | iterations |        tot_time       |        max_time       |      import_time       |
+--------------------------+-----------------------+------------+-----------------------+-----------------------+------------------------+
|         mfx_tfs          |  0.01790064200758934  |     1      |  0.01790064200758934  |  0.01790064200758934  |   4.252603299915791    |
|        mr1l3_homs        |  0.004378553479909897 |     1      |  0.004378553479909897 |  0.004378553479909897 |   0.9409196451306343   |
|          al1k2           | 0.0033328868448734283 |     1      | 0.0033328868448734283 | 0.0033328868448734283 |  0.39289555326104164   |
|         tmo_lamp         |  0.001996569335460663 |     1      |  0.001996569335460663 |  0.001996569335460663 |  0.03505022078752518   |

Okay, so it takes >4s to import this module when we create the mfx transfocator. Let's break down why this happens.

$ python -m pcdsutils.import_timer transfocate
+----------------------------------------+-----------+-----------------+
|              root_module               | self_time | cumulative_time |
+----------------------------------------+-----------+-----------------+
|              transfocate               |  0.839122 |     5.269423    |
|             pkg_resources              |  0.670678 |     0.68067     |
|              pcdsdevices               |  0.543822 |     1.25773     |
|                 ophyd                  |  0.524668 |     1.364044    |
|                 pandas                 |  0.501671 |     0.656044    |
|                openpyxl                |  0.264331 |     0.348003    |
|                pcdscalc                |  0.159113 |     0.155689    |
|                networkx                |  0.156738 |     0.163776    |
|               pcdsutils                |  0.13592  |     0.14277     |
|                 numpy                  |  0.123308 |     0.145757    |
...
$ python -m pcdsutils.import_timer transfocate --show-module transfocate
+-----------------------------+-------------+-----------+-----------------+--------------+
|            module           | root_module | self_time | cumulative_time | indent_level |
+-----------------------------+-------------+-----------+-----------------+--------------+
|            Total            | transfocate |  0.788298 |     6.958058    |     N/A      |
|    transfocate.table.info   | transfocate |  0.763795 |     2.196944    |      2       |
|   transfocate.transfocator  | transfocate |  0.008334 |     0.462072    |      1       |
|    transfocate.calculator   | transfocate |  0.006343 |     0.006343    |      1       |
|       transfocate.lens      | transfocate |  0.002417 |     4.110235    |      1       |
|      transfocate.table      | transfocate |  0.001672 |     2.377045    |      1       |
|         transfocate         | transfocate |  0.001438 |     6.958058    |      0       |
|    transfocate.table.ioc    | transfocate |  0.001245 |     0.001245    |      2       |
|   transfocate.table.report  | transfocate |  0.001106 |     0.176166    |      2       |
| transfocate.table.plc_table | transfocate |  0.00102  |     0.00102     |      2       |
|     transfocate._version    | transfocate |  0.000928 |     0.000928    |      1       |
+-----------------------------+-------------+-----------+-----------------+--------------+
$ python -m pcdsutils.import_timer transfocate --show-chain pandas
Import chain for importing dependency pandas:
transfocate
transfocate.table
transfocate.table.info
pandas

From other projects I know that we can't really fix the pkg_resources import. Some of these imports are red herrings because importing a dev package is slow, but we can't avoid most of them anyway. There are two things in this output that seem easily fixable to me:

  1. Importing pandas is slow, and is only done in the table submodule, and is only done when loading the transfocator device because of this line in the top-level __init__: https://github.com/pcdshub/transfocate/blob/7d6f7ba3d6fadb09114c63da39a80325a625980a/transfocate/__init__.py#L6
  2. Importing transfocate.table.info is slow because it reads and parses a spreadsheet on import. This could be adjusted to only run when first needed, but it also could be avoided with the same fix as point 1.
  3. There are other dependencies that have slow imports here that are also only used in the table submodule

The table module is used when running the automated checkout to generate nice reports and is valuable, but there is no need to import it when running the transfocator in a hutch python session.

So I think this means that I made a giant issue to suggest a 1-line edit to remove a single top-level import... I'll do this myself after lunch and after the after lunch meetings pending other thoughts.

ZLLentz commented 2 years ago

One more thought before I go make my one-line PR: does importing a submodule in __init__.py do anything functionally either than forcing the on-import code to run? I understand that importing resources from a module into init gives us some new top-level imports, but importing a module into init seems to just... create a duplicate route to the same namespace? Am I missing something here?

klauer commented 2 years ago

It was for convenience, as the following won't work without that submodule import:

import transfocate
transfocate.table.x

Instead

import transfocate.table
transfocate.table.x
ZLLentz commented 2 years ago

Ok, that makes a lot of sense and I'll keep it in mind for future cases. I've double-checked that in this case it isn't used within the transfocate module in the current implementation- it must have been used in a previous iteration.