vacanza / python-holidays

Generate and work with holidays in Python
https://pypi.org/project/holidays
MIT License

Maximize load performance #998

Closed arkid15r closed 1 year ago

arkid15r commented 1 year ago

We advertise this library as "fast, efficient", but is performance or memory footprint affected by loading every single class for every single country (that's 360+ classes) even when the user needs holidays for only a single country or just a few, which I assume covers 90%+ of use cases? What about small-footprint devices (e.g. Raspberry Pi)? I would assume so, but I am not very familiar with Python internals.

If so, are there architectures that we can implement to maximize load performance especially when only one (or a few) countries are needed?
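One way to put numbers on the question above is to import the package in a fresh interpreter and count how many modules it pulls in and how much memory Python allocates while doing so. The following is a minimal sketch, not something taken from the project: `import_footprint` and `_PROBE` are hypothetical names, and the stdlib module `json` stands in for `holidays` so that it runs anywhere.

```python
import subprocess
import sys

# Snippet run in a fresh interpreter so that modules already imported by the
# calling process cannot skew the count.
_PROBE = """
import sys, tracemalloc
before = set(sys.modules)
tracemalloc.start()
import {name}
current, _peak = tracemalloc.get_traced_memory()
print(len(set(sys.modules) - before), current)
"""


def import_footprint(name: str) -> tuple[int, int]:
    """Return (newly loaded modules, bytes still allocated) for importing `name`."""
    out = subprocess.run(
        [sys.executable, "-c", _PROBE.format(name=name)],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    modules, allocated = out.split()
    return int(modules), int(allocated)


if __name__ == "__main__":
    # "json" is a stand-in; substitute "holidays" where it is installed.
    print(import_footprint("json"))
```

Note that `tracemalloc` only sees allocations made through Python's memory allocators, so the byte count is a lower bound rather than the full process footprint.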

Originally posted by @mborsetti in https://github.com/dr-prodigy/python-holidays/issues/948#issuecomment-1432382359

mborsetti commented 1 year ago

Some thoughts

I can (slowly) work on all of this if there's interest.

adamchainz commented 1 year ago

I have found this issue because I'm profiling a project's startup time (as per this blog post), and holidays is one of the slower-to-import libraries. It takes 56ms on my machine, most of which is indeed wasted on loading countries which the project doesn't need.

Here's a profile:

python -X importtime -c 'import holidays'

import time: self [us] | cumulative | imported package
import time:       268 |        268 |   _io
import time:        40 |         40 |   marshal
import time:       539 |        539 |   posix
import time:       586 |       1430 | _frozen_importlib_external
import time:       198 |        198 |   time
import time:       142 |        340 | zipimport
import time:        94 |         94 |     _codecs
import time:       493 |        587 |   codecs
import time:       908 |        908 |   encodings.aliases
import time:      1493 |       2987 | encodings
import time:       365 |        365 | encodings.utf_8
import time:       293 |        293 | _signal
import time:        48 |         48 |     _abc
import time:       260 |        308 |   abc
import time:       358 |        666 | io
import time:        64 |         64 |       _stat
import time:       114 |        178 |     stat
import time:       944 |        944 |     _collections_abc
import time:        18 |         18 |       genericpath
import time:        40 |         58 |     posixpath
import time:       271 |       1449 |   os
import time:        36 |         36 |   _sitebuiltins
import time:      1174 |       1174 |   _distutils_hack
import time:       238 |        238 |   types
import time:       211 |        211 |       warnings
import time:       123 |        333 |     importlib
import time:       199 |        199 |     importlib._abc
import time:        74 |         74 |         itertools
import time:        78 |         78 |         keyword
import time:        48 |         48 |           _operator
import time:       217 |        264 |         operator
import time:       112 |        112 |         reprlib
import time:        82 |         82 |         _collections
import time:       771 |       1380 |       collections
import time:        85 |         85 |         _functools
import time:      1065 |       1150 |       functools
import time:       710 |       3239 |     contextlib
import time:       135 |       3904 |   importlib.util
import time:        69 |         69 |   importlib.machinery
import time:        66 |         66 |   sitecustomize
import time:     11430 |      18363 | site
import time:       193 |        193 |   holidays.constants
import time:       196 |        196 |         math
import time:       177 |        177 |         _datetime
import time:      1205 |       1578 |       datetime
import time:        91 |         91 |           dateutil._version
import time:       147 |        238 |         dateutil
import time:       131 |        368 |       dateutil.easter
import time:       159 |        159 |           collections.abc
import time:      1070 |       1070 |             enum
import time:        44 |         44 |               _sre
import time:       175 |        175 |                 re._constants
import time:       458 |        633 |               re._parser
import time:        69 |         69 |               re._casefix
import time:       249 |        993 |             re._compiler
import time:       144 |        144 |             copyreg
import time:       402 |       2607 |           re
import time:       109 |        109 |           _typing
import time:      1881 |       4755 |         typing
import time:        98 |         98 |               _bisect
import time:        97 |        195 |             bisect
import time:        61 |         61 |             hijri_converter.helpers
import time:      1125 |       1125 |             hijri_converter.locales
import time:       178 |        178 |             hijri_converter.ummalqura
import time:       260 |       1817 |           hijri_converter.convert
import time:       143 |       1960 |         hijri_converter
import time:       277 |       6991 |       holidays.calendars
import time:       137 |        137 |             _weakrefset
import time:       381 |        517 |           weakref
import time:        47 |         47 |               org
import time:        16 |         63 |             org.python
import time:         9 |         71 |           org.python.core
import time:       156 |        743 |         copy
import time:        62 |         62 |             _locale
import time:      3489 |       3550 |           locale
import time:       389 |       3939 |         calendar
import time:       617 |        617 |         gettext
import time:        94 |         94 |           fnmatch
import time:        59 |         59 |             _winapi
import time:        41 |         41 |             nt
import time:        35 |         35 |             nt
import time:        32 |         32 |             nt
import time:        51 |         51 |             nt
import time:       167 |        167 |             nt
import time:       134 |        516 |           ntpath
import time:        42 |         42 |           errno
import time:        85 |         85 |             urllib
import time:       993 |       1077 |           urllib.parse
import time:       616 |       2345 |         pathlib
import time:       116 |        116 |             __future__
import time:        20 |         20 |               _string
import time:       470 |        490 |             string
import time:       164 |        164 |                 _struct
import time:       141 |        304 |               struct
import time:       730 |       1033 |             six
import time:       223 |        223 |                 numbers
import time:       514 |        737 |               _decimal
import time:        88 |        824 |             decimal
import time:       616 |        616 |               dateutil._common
import time:       187 |        802 |             dateutil.relativedelta
import time:        24 |         24 |                 six.moves
import time:       140 |        140 |                 dateutil.tz._common
import time:       140 |        140 |                 dateutil.tz._factories
import time:        18 |         18 |                   six.moves.winreg
import time:       185 |        203 |                 dateutil.tz.win
import time:       849 |       1355 |               dateutil.tz.tz
import time:       118 |       1472 |             dateutil.tz
import time:       752 |       5487 |           dateutil.parser._parser
import time:       236 |        236 |           dateutil.parser.isoparser
import time:       258 |       5980 |         dateutil.parser
import time:       706 |      14327 |       holidays.holiday_base
import time:       377 |      23638 |     holidays.countries.albania
import time:       212 |        212 |       holidays.countries.united_states
import time:       160 |        372 |     holidays.countries.american_samoa
import time:       113 |        113 |     holidays.countries.andorra
import time:       165 |        165 |     holidays.countries.angola
import time:       335 |        335 |     holidays.countries.argentina
import time:       103 |        103 |     holidays.countries.armenia
import time:       210 |        210 |     holidays.countries.aruba
import time:       122 |        122 |     holidays.countries.australia
import time:       112 |        112 |     holidays.countries.austria
import time:       136 |        136 |     holidays.countries.azerbaijan
import time:       108 |        108 |     holidays.countries.bahrain
import time:        82 |         82 |     holidays.countries.bangladesh
import time:       329 |        329 |     holidays.countries.belarus
import time:       127 |        127 |     holidays.countries.belgium
import time:       136 |        136 |     holidays.countries.bolivia
import time:       150 |        150 |     holidays.countries.bosnia_and_herzegovina
import time:       124 |        124 |     holidays.countries.botswana
import time:       125 |        125 |     holidays.countries.brazil
import time:       117 |        117 |     holidays.countries.bulgaria
import time:       112 |        112 |     holidays.countries.burundi
import time:       164 |        164 |     holidays.countries.canada
import time:       122 |        122 |         pymeeus
import time:        74 |         74 |         pymeeus.base
import time:       261 |        261 |         pymeeus.Angle
import time:       502 |        959 |       pymeeus.Epoch
import time:       139 |        139 |           pymeeus.Interpolation
import time:       569 |        707 |         pymeeus.Coordinates
import time:      1136 |       1136 |         pymeeus.Earth
import time:       190 |       2031 |       pymeeus.Sun
import time:       328 |        328 |           sysconfig
import time:       476 |        476 |           _sysconfigdata__linux_aarch64-linux-gnu
import time:      1204 |       2007 |         zoneinfo._tzpath
import time:       843 |        843 |         zoneinfo._common
import time:       133 |        133 |         _zoneinfo
import time:       423 |       3405 |       zoneinfo
import time:       234 |       6627 |     holidays.countries.chile
import time:       106 |        106 |     holidays.countries.china
import time:       125 |        125 |     holidays.countries.colombia
import time:        87 |         87 |     holidays.countries.croatia
import time:        91 |         91 |     holidays.countries.cuba
import time:       104 |        104 |     holidays.countries.curacao
import time:        80 |         80 |     holidays.countries.cyprus
import time:        99 |         99 |     holidays.countries.czechia
import time:        83 |         83 |     holidays.countries.denmark
import time:       107 |        107 |     holidays.countries.djibouti
import time:       489 |        489 |     holidays.countries.dominican_republic
import time:        98 |         98 |     holidays.countries.egypt
import time:       107 |        107 |     holidays.countries.estonia
import time:       100 |        100 |     holidays.countries.eswatini
import time:       175 |        175 |     holidays.countries.ethiopia
import time:       143 |        143 |     holidays.countries.finland
import time:       137 |        137 |     holidays.countries.france
import time:       113 |        113 |     holidays.countries.georgia
import time:       126 |        126 |     holidays.countries.germany
import time:       199 |        199 |     holidays.countries.greece
import time:       184 |        184 |     holidays.countries.guam
import time:       133 |        133 |     holidays.countries.honduras
import time:       141 |        141 |     holidays.countries.hongkong
import time:       186 |        186 |     holidays.countries.hungary
import time:       124 |        124 |     holidays.countries.iceland
import time:       117 |        117 |     holidays.countries.india
import time:       196 |        196 |     holidays.countries.indonesia
import time:       149 |        149 |     holidays.countries.ireland
import time:       180 |        180 |       holidays.countries.united_kingdom
import time:       107 |        286 |     holidays.countries.isle_of_man
import time:        94 |         94 |             convertdate.utils
import time:       160 |        253 |           convertdate.gregorian
import time:        65 |         65 |           convertdate.julian
import time:       154 |        471 |         convertdate.armenian
import time:        89 |         89 |         convertdate.bahai
import time:        58 |         58 |         convertdate.coptic
import time:        56 |         56 |           convertdate.julianday
import time:       114 |        170 |         convertdate.daycount
import time:        46 |         46 |         convertdate.dublin
import time:        74 |         74 |             convertdate.data
import time:       169 |        243 |           convertdate.data.french_republican_days
import time:        99 |        342 |         convertdate.french_republican
import time:        95 |         95 |         convertdate.hebrew
import time:        78 |         78 |           convertdate.islamic
import time:       203 |        280 |         convertdate.holidays
import time:        83 |         83 |         convertdate.indian_civil
import time:       129 |        129 |         convertdate.iso
import time:       122 |        122 |         convertdate.mayan
import time:        92 |         92 |         convertdate.persian
import time:       505 |        505 |           convertdate.data.positivist
import time:        94 |        598 |         convertdate.positivist
import time:        62 |         62 |         convertdate.ordinal
import time:       326 |       2956 |       convertdate
import time:       132 |       3088 |     holidays.countries.israel
import time:       164 |        164 |     holidays.countries.italy
import time:        80 |         80 |     holidays.countries.jamaica
import time:       236 |        236 |     holidays.countries.japan
import time:        84 |         84 |     holidays.countries.kazakhstan
import time:       101 |        101 |     holidays.countries.kenya
import time:        87 |         87 |     holidays.countries.kyrgyzstan
import time:        87 |         87 |     holidays.countries.latvia
import time:        88 |         88 |     holidays.countries.lesotho
import time:        93 |         93 |     holidays.countries.liechtenstein
import time:        74 |         74 |     holidays.countries.lithuania
import time:       101 |        101 |     holidays.countries.luxembourg
import time:        71 |         71 |     holidays.countries.madagascar
import time:        70 |         70 |     holidays.countries.malawi
import time:       235 |        235 |     holidays.countries.malaysia
import time:        75 |         75 |     holidays.countries.malta
import time:       110 |        110 |     holidays.countries.marshall_islands
import time:        64 |         64 |     holidays.countries.mexico
import time:        85 |         85 |     holidays.countries.moldova
import time:       123 |        123 |     holidays.countries.monaco
import time:        87 |         87 |     holidays.countries.montenegro
import time:        80 |         80 |     holidays.countries.morocco
import time:        86 |         86 |     holidays.countries.mozambique
import time:        87 |         87 |     holidays.countries.namibia
import time:        92 |         92 |     holidays.countries.netherlands
import time:       121 |        121 |     holidays.countries.new_zealand
import time:       100 |        100 |     holidays.countries.nicaragua
import time:       104 |        104 |     holidays.countries.nigeria
import time:        83 |         83 |     holidays.countries.north_macedonia
import time:        86 |         86 |     holidays.countries.northern_mariana_islands
import time:        97 |         97 |           _heapq
import time:       152 |        248 |         heapq
import time:       371 |        619 |       dateutil.rrule
import time:       108 |        726 |     holidays.countries.norway
import time:       169 |        169 |     holidays.countries.pakistan
import time:        70 |         70 |     holidays.countries.panama
import time:        80 |         80 |     holidays.countries.paraguay
import time:        83 |         83 |     holidays.countries.peru
import time:        71 |         71 |     holidays.countries.philippines
import time:       223 |        223 |     holidays.countries.poland
import time:       190 |        190 |     holidays.countries.portugal
import time:       148 |        148 |     holidays.countries.puerto_rico
import time:       110 |        110 |     holidays.countries.romania
import time:       105 |        105 |     holidays.countries.russia
import time:        98 |         98 |     holidays.countries.san_marino
import time:       110 |        110 |     holidays.countries.saudi_arabia
import time:        93 |         93 |     holidays.countries.serbia
import time:       288 |        288 |     holidays.countries.singapore
import time:        87 |         87 |     holidays.countries.slovakia
import time:       114 |        114 |     holidays.countries.slovenia
import time:        89 |         89 |     holidays.countries.south_africa
import time:       214 |        214 |         korean_lunar_calendar.korean_lunar_calendar
import time:       120 |        333 |       korean_lunar_calendar
import time:       164 |        497 |     holidays.countries.south_korea
import time:       178 |        178 |     holidays.countries.spain
import time:       134 |        134 |     holidays.countries.sweden
import time:        90 |         90 |     holidays.countries.switzerland
import time:        92 |         92 |     holidays.countries.taiwan
import time:       520 |        520 |     holidays.countries.thailand
import time:       123 |        123 |     holidays.countries.tunisia
import time:       152 |        152 |     holidays.countries.turkey
import time:       148 |        148 |     holidays.countries.ukraine
import time:        88 |         88 |     holidays.countries.united_arab_emirates
import time:       110 |        110 |     holidays.countries.united_states_minor_outlying_islands
import time:        92 |         92 |     holidays.countries.united_states_virgin_islands
import time:       160 |        160 |     holidays.countries.uruguay
import time:       102 |        102 |     holidays.countries.uzbekistan
import time:        76 |         76 |     holidays.countries.vatican_city
import time:        70 |         70 |     holidays.countries.venezuela
import time:       176 |        176 |     holidays.countries.vietnam
import time:       122 |        122 |     holidays.countries.zambia
import time:        81 |         81 |     holidays.countries.zimbabwe
import time:      1783 |      51229 |   holidays.countries
import time:       110 |        110 |     holidays.financial.european_central_bank
import time:       125 |        125 |     holidays.financial.ny_stock_exchange
import time:       109 |        343 |   holidays.financial
import time:        47 |         47 |         _ast
import time:       856 |        902 |       ast
import time:        93 |         93 |           _opcode
import time:       220 |        312 |         opcode
import time:       549 |        861 |       dis
import time:        90 |         90 |           token
import time:       578 |        668 |         tokenize
import time:       112 |        780 |       linecache
import time:      1343 |       3885 |     inspect
import time:       174 |       4058 |   holidays.utils
import time:       581 |      56402 | holidays
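
A profile this long is easier to read when aggregated. The sketch below parses `-X importtime` output (which CPython writes to stderr) and sums the self time of every module under a given package prefix. The helper names `parse_importtime`, `self_time_for` and `run_importtime` are made up for illustration.

```python
import re
import subprocess
import sys

# Matches data rows such as "import time:       113 |        113 |     pkg.mod".
# The header row has no digits in the first column, so it is skipped.
_ROW = re.compile(r"^import time:\s+(\d+) \|\s+(\d+) \|\s+(\S+)$")


def parse_importtime(text):
    """Return a list of (module, self_us, cumulative_us) tuples."""
    rows = []
    for line in text.splitlines():
        match = _ROW.match(line)
        if match:
            self_us, cumulative_us, name = match.groups()
            rows.append((name, int(self_us), int(cumulative_us)))
    return rows


def self_time_for(rows, prefix):
    """Sum self time (microseconds) of `prefix` and all of its submodules."""
    return sum(
        self_us
        for name, self_us, _ in rows
        if name == prefix or name.startswith(prefix + ".")
    )


def run_importtime(module):
    """Profile `import module` in a fresh interpreter and return parsed rows."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_importtime(result.stderr)
```

For example, `self_time_for(run_importtime("holidays"), "holidays.countries")` would give the time spent in the country modules themselves, excluding the third-party dependencies they pull in.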

Lazy-loading these modules would be the biggest win for speed, I think. This may involve moving some attributes like the alpha codes into a centralized dictionary, rather than pulling them from the classes, since you want to avoid importing the class until requested.
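
To make the idea concrete, here is a minimal sketch of lazy loading with a module-level `__getattr__` (PEP 562) backed by a centralized registry. It is an illustration only and not necessarily how the library ended up implementing it. `lazy_pkg` and `_REGISTRY` are hypothetical names, and two stdlib modules stand in for per-country modules so the example runs without `holidays` installed.

```python
import importlib
import types

# Hypothetical centralized registry: public name -> module that defines it.
# In a real package this would map country class names (and alpha codes)
# to their country modules.
_REGISTRY = {
    "Fraction": "fractions",
    "Decimal": "decimal",
}


def _lazy_getattr(name):
    """Import the defining module only when the attribute is first requested."""
    try:
        module_name = _REGISTRY[name]
    except KeyError:
        raise AttributeError(
            f"module 'lazy_pkg' has no attribute {name!r}"
        ) from None
    value = getattr(importlib.import_module(module_name), name)
    # Cache on the package so __getattr__ is not called again for this name.
    setattr(lazy_pkg, name, value)
    return value


# Stand-in for a package's __init__.py. Inside a real package you would simply
# define `def __getattr__(name): ...` at module level.
lazy_pkg = types.ModuleType("lazy_pkg")
lazy_pkg.__getattr__ = _lazy_getattr
lazy_pkg.__dir__ = lambda: sorted(_REGISTRY)

if __name__ == "__main__":
    print("Fraction" in vars(lazy_pkg))  # not loaded yet
    print(lazy_pkg.Fraction(1, 2))       # triggers the import
    print("Fraction" in vars(lazy_pkg))  # now cached
```

Because the registry is plain data, listing the supported names (via `dir()`) never has to import any of the underlying modules.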

mborsetti commented 1 year ago

> I can (slowly) work on all of this if there's interest.

@arkid15r Any interest? We'd probably be looking at a transition architecture with deprecation followed by a much faster v1.0.

arkid15r commented 1 year ago

> @arkid15r Any interest? We'd probably be looking at a transition architecture with deprecation followed by a much faster v1.0.

Hey Mike, absolutely! This idea is great! The library performance is the second highest priority for us. I didn't look into any implementation details yet and at this point I can only see potential compatibility issues that could be a show stopper for this. I'm looking forward to seeing any early stage PR(s) or PoC code to discuss specifics.

Thank you!

// @adamchainz I appreciate your input too!

mborsetti commented 1 year ago

> Hey Mike, absolutely! This idea is great! The library performance is the second highest priority for us. I didn't look into any implementation details yet and at this point I can only see potential compatibility issues that could be a show stopper for this. I'm looking forward to seeing any early stage PR(s) or PoC code to discuss specifics.

Hi Arkadii, I think I found a way to solve all three issues simply and elegantly, and still be backwards compatible. However, I hit an undocumented function argument, which you added; can you please see my comments in https://github.com/dr-prodigy/python-holidays/pull/878?

arkid15r commented 1 year ago

Hey Mike, that sounds promising! I'm eager to see your solution! I replied to your comments for #878, let me know if you still have questions.

arkid15r commented 1 year ago

v0.25, which includes the lazy loading implementation, has just been released.

adamchainz commented 1 year ago

👍 Thank you for doing this.

python -X importtime -c 'import holidays' now shows 29ms for me, compared to 56ms before. A 27ms saving, and most of the time is now in Python standard library modules.

arkid15r commented 1 year ago

Thanks for your input and the feedback @adamchainz