vavrines / Kinetic.jl

Universal modeling and simulation of fluid mechanics upon machine learning. From the Boltzmann equation, heading towards multiscale and multiphysics flows.
https://xiaotianbai.com/Kinetic.jl/dev
MIT License
131 stars, 15 forks

Loading time #23

Closed: henry2004y closed this issue 1 year ago

henry2004y commented 1 year ago

Kinetic.jl seems to have a long list of dependencies:

Importing time

```julia
julia> @time_imports using Kinetic
      0.1 ms  Reexport
      4.7 ms  BSON
      8.7 ms  OrderedCollections
     38.4 ms  MacroTools
      2.0 ms  TranscodingStreams
      0.3 ms  Requires
    209.2 ms  FileIO 4.95% compilation time (15% recompilation)
    294.2 ms  JLD2
      0.2 ms  SnoopPrecompile
     70.2 ms  Parsers 10.73% compilation time
      0.1 ms  DataValueInterfaces
      1.6 ms  DataAPI
      0.1 ms  IteratorInterfaceExtensions
      0.1 ms  TableTraits
     17.7 ms  Tables
     27.8 ms  PooledArrays
    102.2 ms  SentinelArrays
      8.6 ms  InlineStrings
     23.5 ms  WeakRefStrings
      0.2 ms  Zlib_jll
      1.7 ms  CodecZlib
      0.3 ms  Compat
     15.6 ms  FilePathsBase 26.48% compilation time
   1783.6 ms  CSV 88.95% compilation time
      4.1 ms  CEnum
     25.3 ms  Preferences
      0.4 ms  JLLWrappers
    265.1 ms  LLVMExtra_jll 99.11% compilation time (99% recompilation)
    178.7 ms  LLVM 40.89% compilation time (100% recompilation)
      0.3 ms  ExprTools
     91.1 ms  TimerOutputs 16.86% compilation time
    448.4 ms  GPUCompiler 1.51% compilation time
      0.4 ms  Adapt
      3.4 ms  GPUArraysCore
    794.6 ms  GPUArrays
      6.8 ms  BFloat16s
      0.3 ms  CompilerSupportLibraries_jll
     32.7 ms  RandomNumbers 31.28% compilation time
      9.9 ms  Random123
     76.8 ms  ChainRulesCore
     10.5 ms  AbstractFFTs
   1542.1 ms  CUDA 0.69% compilation time
      5.6 ms  ArrayInterfaceCore
      5.1 ms  StaticArraysCore
    763.9 ms  StaticArrays
      3.6 ms  DocStringExtensions 60.31% compilation time
     22.1 ms  FunctionWrappers
      0.3 ms  MuladdMacro
      0.1 ms  UnPack
      0.6 ms  Parameters
     28.0 ms  FiniteDiff 34.90% compilation time
     10.4 ms  IrrationalConstants
      2.3 ms  DiffRules
      5.1 ms  DiffResults
      0.2 ms  OpenLibm_jll
      0.4 ms  NaNMath
      1.1 ms  ChangesOfVariables
      1.8 ms  InverseFunctions
      0.9 ms  LogExpFunctions
      0.4 ms  OpenSpecFun_jll
     27.1 ms  SpecialFunctions
      0.4 ms  CommonSubexpressions
    185.1 ms  ForwardDiff
      1.6 ms  ConstructionBase
     70.7 ms  Setfield
    248.7 ms  RecipesBase
      0.3 ms  ZygoteRules
      0.6 ms  ArrayInterfaceStaticArraysCore
    234.3 ms  FillArrays
     50.5 ms  RecursiveArrayTools
     39.5 ms  IterativeSolvers
      0.1 ms  IfElse
     57.6 ms  Static
     27.6 ms  ArrayInterface
    128.3 ms  OffsetArrays
      0.8 ms  ArrayInterfaceOffsetArrays
      1.1 ms  ArrayInterfaceStaticArrays
      0.2 ms  SIMDTypes
      3.4 ms  ManualMemory
      7.4 ms  LayoutPointers
      1.3 ms  CpuId
    310.8 ms  CPUSummary 89.71% compilation time
      0.1 ms  BitTwiddlingConvenienceFunctions
    126.8 ms  HostCPUFeatures 26.10% compilation time
    603.0 ms  VectorizationBase
      5.8 ms  SLEEFPirates
     78.3 ms  ThreadingUtilities 76.31% compilation time
     23.7 ms  PolyesterWeave 68.56% compilation time
      9.5 ms  CloseOpenIntervals
      6.8 ms  SIMDDualNumbers
    387.1 ms  LoopVectorization
    232.9 ms  StrideArraysCore 1.79% compilation time
      1.5 ms  Polyester
    223.9 ms  TriangularSolve 4.28% compilation time
    643.8 ms  RecursiveFactorization
      0.2 ms  CommonSolve
      1.3 ms  FunctionWrappersWrappers
      1.3 ms  RuntimeGeneratedFunctions
      0.3 ms  EnumX
    263.0 ms  SciMLBase
     10.6 ms  NonlinearSolve
      0.6 ms  FastBroadcast
    149.9 ms  DataStructures
      0.5 ms  SortingAlgorithms
     14.8 ms  Missings
      0.4 ms  StatsAPI
     51.3 ms  StatsBase
     39.8 ms  PDMats
    217.4 ms  Rmath_jll 99.76% compilation time (100% recompilation)
     95.9 ms  Rmath 88.76% compilation time
      3.3 ms  Calculus
    175.7 ms  DualNumbers
      1.2 ms  HypergeometricFunctions
      8.2 ms  StatsFuns
      5.7 ms  QuadGK
      5.2 ms  DensityInterface
    397.7 ms  Distributions
      0.2 ms  Tricks
    521.9 ms  DiffEqBase 4.40% compilation time
      7.3 ms  SimpleTraits
     12.4 ms  ArnoldiMethod
      1.1 ms  Inflate
    109.6 ms  Graphs
      1.1 ms  VertexSafeGraphs
     24.7 ms  SparseDiffTools 34.71% compilation time
   1586.9 ms  ArrayLayouts
    357.1 ms  BandedMatrices
     41.8 ms  NNlib 49.64% compilation time
    224.8 ms  MatrixFactorizations
    150.1 ms  LazyArrays
    265.2 ms  BlockArrays
    132.2 ms  BlockBandedMatrices
    260.2 ms  LazyBandedMatrices
    582.1 ms  DiffEqOperators 1.31% compilation time
      0.6 ms  FastGaussQuadrature
      0.7 ms  FFTW_jll
    669.5 ms  FFTW 4.14% compilation time (100% recompilation)
    499.6 ms  MutableArithmetics
    175.6 ms  MultivariatePolynomials
     14.1 ms  NLSolversBase
      2.4 ms  PositiveFactorizations
     23.2 ms  LineSearches
     77.3 ms  Optim
     19.4 ms  VersionParsing
     46.0 ms  JSON
     18.5 ms  Conda
    946.0 ms  PyCall 54.10% compilation time (6% recompilation)
     59.1 ms  StructArrays
     54.4 ms  TypedPolynomials
   1178.9 ms  Libiconv_jll 99.95% compilation time (100% recompilation)
      0.5 ms  XML2_jll
      6.8 ms  LightXML
     11.8 ms  WriteVTK
      8.8 ms  ProgressMeter
      5.0 ms  FiniteMesh
[ Info: Kinetic will run with 1 worker and 1 thread
   3786.0 ms  KitBase 29.42% compilation time (70% recompilation)
      9.2 ms  InvertedIndices
      1.0 ms  Formatting
    317.3 ms  StringManipulation
    300.3 ms  Crayons
      1.1 ms  LaTeXStrings
    773.9 ms  PrettyTables
   9202.8 ms  DataFrames
      5.9 ms  ProgressLogging
     19.0 ms  ShowCases
     74.9 ms  InitialValues
   1802.3 ms  BangBang 96.43% compilation time (97% recompilation)
      3.1 ms  ContextVariablesX
      0.1 ms  FLoopsBase
      4.5 ms  PrettyPrint
      0.6 ms  NameResolution
    358.6 ms  MLStyle
      4.7 ms  JuliaVariables
      0.6 ms  ArgCheck
    336.6 ms  Baselet
      0.1 ms  CompositionsBase
      0.1 ms  DefineSingletons
     37.0 ms  MicroCollections
     22.3 ms  SplittablesBase
    174.5 ms  Transducers 34.79% compilation time (36% recompilation)
     13.3 ms  FLoops
    130.3 ms  Accessors 18.74% compilation time (33% recompilation)
    629.1 ms  FoldsThreads 87.87% compilation time (4% recompilation)
     20.1 ms  MLUtils
      6.3 ms  Functors
     37.8 ms  Optimisers
      0.2 ms  RealDot
     45.2 ms  ChainRules
    331.8 ms  IRTools
   1342.0 ms  Zygote 3.51% compilation time
    112.7 ms  OneHotArrays
    129.0 ms  NNlibCUDA
    517.2 ms  Flux
      2.8 ms  ConsoleProgressMonitor
     58.2 ms  AbstractTrees
      9.6 ms  LeftChildRightSiblingTrees
      0.9 ms  TerminalLoggers
     10.8 ms  LoggingExtras
    748.1 ms  Optimization 89.79% compilation time (74% recompilation)
    515.2 ms  OptimizationFlux
      0.6 ms  OptimizationOptimJL
      0.3 ms  OptimizationOptimisers
    424.5 ms  OptimizationPolyalgorithms
      5.5 ms  UnsafeAtomics
     36.8 ms  Atomix
      0.2 ms  UnsafeAtomicsLLVM
     12.9 ms  KernelAbstractions
      0.7 ms  CUDAKernels
     68.6 ms  LuxLib
    605.7 ms  ComponentArrays 27.41% compilation time
  12609.1 ms  Lux 0.18% compilation time
    415.8 ms  Tracker 62.23% compilation time (89% recompilation)
    456.8 ms  Solaris
    478.2 ms  KitML
    480.6 ms  Kinetic
```
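With a list this long it is hard to see which packages dominate. One way to check is to parse the report and sort it by time. The following is only a minimal sketch: the helper name `parse_time_imports` is made up for illustration, and the sample is a handful of lines copied from the report above rather than the full output.

```julia
# A few lines copied from the @time_imports report above (not the full output).
report = """
   1783.6 ms  CSV 88.95% compilation time
   1542.1 ms  CUDA 0.69% compilation time
[ Info: Kinetic will run with 1 worker and 1 thread
   3786.0 ms  KitBase 29.42% compilation time (70% recompilation)
   9202.8 ms  DataFrames
  12609.1 ms  Lux 0.18% compilation time
    480.6 ms  Kinetic
"""

# Hypothetical helper: turn "<time> ms <package> ..." lines into
# (package, milliseconds) pairs, ignoring anything that does not match.
function parse_time_imports(text::AbstractString)
    entries = Tuple{String,Float64}[]
    for line in eachline(IOBuffer(text))
        m = match(r"^\s*([0-9.]+)\s+ms\s+(\S+)", line)
        m === nothing && continue   # skips log lines such as "[ Info: ..."
        push!(entries, (String(m.captures[2]), parse(Float64, m.captures[1])))
    end
    return entries
end

entries = parse_time_imports(report)
sort!(entries; by = last, rev = true)   # slowest package first

for (pkg, ms) in entries
    println(rpad(pkg, 12), ms, " ms")
end
```

On this sample, Lux and DataFrames sort to the top, followed by KitBase.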

Is it possible to cut this down? For example, could the Lux loading time (about 12.6 s above) be avoided entirely when the machine-learning functionality is not used?

vavrines commented 1 year ago

That is exactly why I split the package into submodules. For methods that do not need machine learning, just use KitBase.jl. It should load much faster.
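To check the difference on your own machine, the same measurement can be repeated for KitBase alone. This is a sketch, assuming Julia 1.8 or later (where `@time_imports` is available) and that KitBase has been added to the active environment. The guard using `Base.find_package` is only there so the snippet does not error when the package is missing.

```julia
using InteractiveUtils  # provides @time_imports; loaded automatically in the REPL

# Run this in a fresh session so that packages already loaded by an earlier
# `using Kinetic` do not skew the numbers.
if Base.find_package("KitBase") === nothing
    println("KitBase is not installed; run `import Pkg; Pkg.add(\"KitBase\")` first.")
else
    @time_imports using KitBase
end
```

Per the report in the first comment, the ML-related packages (Flux, Lux, Solaris, KitML, and so on) are only loaded after KitBase, so they should not appear in this shorter list.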

henry2004y commented 1 year ago

I see. At first I had the wrong impression that KitBase, like many similarly named packages, contains only basic types. I may also have confused it with the xxxCore naming convention :sweat:. Feel free to close this issue.