Closed liangwang0734 closed 6 years ago
Profiling results for loading 300 time steps with 4 patches per step:
>>> %prun viscid.load_file('./global.vpc')
5046121 function calls (4792979 primitive calls) in 5.138 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
64238 0.503 0.000 0.830 0.000 bucket.py:71(set_item)
64844 0.470 0.000 1.194 0.000 tree.py:19(__init__)
315156/63028 0.399 0.000 0.448 0.000 vutil.py:101(subclass_spider)
63024 0.395 0.000 1.916 0.000 field.py:472(__init__)
1 0.349 0.349 5.136 5.136 vpic.py:428(_parse)
63024 0.285 0.000 3.069 0.000 field.py:342(wrap_field)
128849 0.182 0.000 0.401 0.000 npdatetime.py:729(_check_like)
63024 0.179 0.000 0.837 0.000 field.py:320(field_type)
63027 0.178 0.000 0.178 0.000 {method 'items' of 'dict' objects}
128844 0.160 0.000 0.160 0.000 {built-in method builtins.hasattr}
63024 0.144 0.000 3.269 0.000 vfile.py:195(_make_field)
315120 0.136 0.000 0.169 0.000 field.py:1433(istype)
65147 0.129 0.000 0.704 0.000 tree.py:232(time)
64238 0.126 0.000 0.275 0.000 bucket.py:53(_make_hashable)
777426 0.120 0.000 0.120 0.000 {built-in method builtins.isinstance}
131224 0.119 0.000 0.119 0.000 {method 'format' of 'str' objects}
63024 0.108 0.000 1.046 0.000 grid.py:262(__setitem__)
63024 0.101 0.000 0.137 0.000 vpic.py:637(__init__)
64237 0.080 0.000 0.920 0.000 bucket.py:198(__setitem__)
65183 0.074 0.000 0.224 0.000 npdatetime.py:782(is_timedelta_like)
63024 0.072 0.000 1.118 0.000 grid.py:104(add_field)
Thoughts on reading vpic fields into a single viscid.field.Field:
A VPIC_DataWrapper could be given a list of file_wrappers, along with information about where each block belongs in the global field. A worthwhile question to ask is: are you more likely to be bitten by the cost of assembling global fields on access, or by the one-time cost of creating a large number of field objects? I'm sure it depends on how the domain is decomposed and how the data is being used. Maybe some kind of switch is needed to let the user decide on a case-by-case basis.
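To make the assemble-on-access idea concrete, here is a minimal sketch using plain NumPy. It is not Viscid code: `LazyGlobalArray`, `make_loader`, and the `(loader, slices)` patch description are hypothetical names invented for illustration, and the loaders just fabricate constant arrays where a real reader would read one block of one binary file.

```python
import numpy as np


class LazyGlobalArray:
    """Hypothetical stand-in for a data wrapper that owns many patches."""

    def __init__(self, global_shape, patches, dtype="f4"):
        # patches: list of (loader, slices) pairs. loader() returns one
        # patch as an ndarray; slices says where it lands in the global array.
        self.global_shape = tuple(global_shape)
        self.patches = list(patches)
        self.dtype = np.dtype(dtype)
        self._cache = None

    def assemble(self):
        # Pay the assembly cost once, on first access, then reuse the result.
        if self._cache is None:
            out = np.empty(self.global_shape, dtype=self.dtype)
            for loader, slices in self.patches:
                out[slices] = loader()
            self._cache = out
        return self._cache

    def __getitem__(self, item):
        return self.assemble()[item]


def make_loader(i, n=8):
    # Stands in for reading patch i from disk (assumption for this sketch).
    return lambda: np.full((n, 2), i, dtype="f4")


# Four patches of 8 cells each, laid side by side along the first axis.
patches = [
    (make_loader(i), (slice(8 * i, 8 * (i + 1)), slice(None)))
    for i in range(4)
]
field = LazyGlobalArray((32, 2), patches)
```

Nothing is read until `field[...]` or `field.assemble()` is called, so creating the object costs the same regardless of how many patches it describes. The trade-off is the one raised above: the first access touches every patch, even if only a small slice was requested.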
On the file opening, my results for a vpic run with 1024x1x1 patches are shown below. Can we make/wrap the fields only when they are used for the first time, or do I have to avoid making those field getters in the reader? (I remember you suggested I be cautious with the field getters, but I could not find the source.)
50979356 function calls (48415072 primitive calls) in 104.067 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
651290 14.426 0.000 31.925 0.000 tree.py:19(__init__)
1302609 8.023 0.000 8.023 0.000 {built-in method builtins.hasattr}
638982 7.634 0.000 7.634 0.000 {method 'items' of 'dict' objects}
651266 7.321 0.000 13.985 0.000 bucket.py:71(set_item)
3204096/640000 7.187 0.000 8.183 0.000 vutil.py:101(subclass_spider)
1 6.364 6.364 104.063 104.063 vpic.py:428(_parse)
638976 5.835 0.000 47.915 0.000 field.py:472(__init__)
638976 5.704 0.000 68.635 0.000 field.py:342(wrap_field)
1327142 3.066 0.000 3.066 0.000 {method 'format' of 'str' objects}
638976 2.845 0.000 14.423 0.000 field.py:320(field_type)
7898636 2.747 0.000 2.747 0.000 {built-in method builtins.isinstance}
1308734 2.712 0.000 12.120 0.000 npdatetime.py:729(_check_like)
638976 2.480 0.000 71.879 0.000 vfile.py:195(_make_field)
651302 2.350 0.000 17.075 0.000 tree.py:232(time)
3194880 2.053 0.000 2.676 0.000 field.py:1433(istype)
651266 1.990 0.000 5.720 0.000 bucket.py:53(_make_hashable)
638976 1.818 0.000 17.564 0.000 grid.py:262(__setitem__)
638976 1.795 0.000 2.476 0.000 vpic.py:637(__init__)
1953798 1.442 0.000 1.442 0.000 {built-in method builtins.hash}
638976 1.278 0.000 18.842 0.000 grid.py:104(add_field)
5764109 1.251 0.000 1.251 0.000 {method 'lower' of 'str' objects}
651265 1.245 0.000 15.480 0.000 bucket.py:198(__setitem__)
660518 1.124 0.000 4.111 0.000 npdatetime.py:782(is_timedelta_like)
660518 1.077 0.000 10.440 0.000 npdatetime.py:769(is_datetime_like)
In principle, the grid can defer field creation, but in practice, it'll end up having to override every method that tries to get metadata / coordinates / fields, which sounds rather error-prone. Ultimately, something has to keep track of each patch because each patch is unique: it lives in a specific block of a specific binary file and has specific local domain extents. The issue appears to be that Viscid's whole hierarchical structure has initialization overhead that scales linearly with the number of patches.
Maybe the whole tree / bucket infrastructure could be streamlined... the structure as it exists now grew organically as different workflows needed different features. But I suspect there may not be much to save with this approach, since the per-call overhead is already on the order of microseconds.
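As a rough check on the scaling, the per-field cost can be read straight off the two profiles. The numbers below are copied from the listings above; treating the ncalls of field.py:472(__init__) as the number of fields created is my assumption.

```python
# Totals, field counts, and _make_field cumtime copied from the two listings.
runs = {
    "300 steps x 4 patches": {
        "total_s": 5.138, "fields": 63024, "make_field_s": 3.269,
    },
    "1024x1x1 patches": {
        "total_s": 104.067, "fields": 638976, "make_field_s": 71.879,
    },
}


def per_field_us(seconds, n_fields):
    """Average cost per created field, in microseconds."""
    return 1e6 * seconds / n_fields


summary = {
    name: (
        per_field_us(r["total_s"], r["fields"]),
        per_field_us(r["make_field_s"], r["fields"]),
    )
    for name, r in runs.items()
}

for name, (total_us, make_us) in summary.items():
    print(f"{name}: {total_us:.1f} us/field overall, "
          f"{make_us:.1f} us/field inside _make_field")
```

Each individual call is cheap, but a few dozen of them are made for every field, so the load time is set almost entirely by how many fields get created.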
Another solution might be: one Field per physical quantity per timestep, and have the DataWrapper deal with assembly on access (like I mentioned above). Then initialization overhead should scale more like number of physical quantities × number of time steps.
I can't imagine init scaling better than number of physical quantities × number of time steps without significant effort. To do this, one would need custom dataset objects that defer metadata creation, but still know about all the timesteps / physical quantities. Feel free to investigate this, but it sounds like edge-case-city.
Ok, maybe I'm being a little pessimistic. In principle, a deferred hierarchy might not be too hard, one just needs to make the following:
Now that I write this all out, it might be convenient if the base Dataset / Grid classes accepted a callback function for deferred child creation. That could probably simplify a fair amount of existing reader hacks.
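A minimal sketch of what such a callback hook could look like, in plain Python. `DeferredNode`, `make_grids`, and `make_patch_fields` are hypothetical names for illustration only; this is not Viscid's Dataset / Grid API, and a real version would also have to cover metadata and coordinate lookups.

```python
class DeferredNode:
    """Hypothetical container whose children are built on first access."""

    def __init__(self, name, make_children=None):
        self.name = name
        # Callback taking this node and returning an iterable of children.
        self._make_children = make_children
        self._children = None

    @property
    def populated(self):
        return self._children is not None

    @property
    def children(self):
        # Run the callback at most once, the first time children are needed.
        if self._children is None:
            maker = self._make_children
            self._children = list(maker(self)) if maker is not None else []
        return self._children


created = []  # records which "fields" were actually built


def make_patch_fields(grid):
    # Stands in for the expensive per-patch field wrapping in a reader.
    fields = [f"{grid.name}/patch{i}" for i in range(4)]
    created.extend(fields)
    return fields


def make_grids(dataset):
    # One cheap node per time step; its fields are deferred again.
    return [DeferredNode(f"t{step}", make_patch_fields) for step in range(300)]


dset = DeferredNode("run", make_grids)
grid = dset.children[10]      # builds the 300 grid nodes, but no fields
first_fields = grid.children  # builds fields for time step 10 only
```

With this shape, opening the file only pays for one node per time step, and the per-patch cost is paid for the time steps that are actually touched.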
There are a few issues:
- vpic.py:VPIC_File:_parse: can we avoid the loop over time and reuse the first frame's grid for all time steps?
- vpyplot.py:plot would make a subplot for each block. The overhead of each new subplot can be substantial if the number of blocks is large, even if the actual data size is not. It would be truly ideal if we could assemble the data into one array and make one plot.
- seed.RectilinearMeshPoints.