Closed mikofski closed 7 years ago
More details and examples:
Currently in a module, there is a pvcell
member that is a list of the cells in that module. The cell number is identified by cell position and its index in the list. If the module has 96 cells, then the list is 96 items long, and each item is a unique reference created by using copy.copy
.
One option is very simple, instead of creating cheap copies, when creating the module, just use the same reference. This is the simplest and quickest implementation since it requires the least amount of code changes. Change PVmodule
:
# only single pvcell instead of a list of pvcells
if isinstance(pvcells, PVcell):
# do NOT use copy, instead all indices point to the same pvcell reference!
pvcells = [pvcells] * self.numberCells
Another option, which may be helpful, or may actually be worse would be to add a new field to each object that maps cell indices to the references in the list, and make each cell reference unique. This requires a little more book keeping, so I'm starting to think it's a BAD idea, but I'll list it here anyway. Change PVmodule
:
# only single pvcell instead of a list of pvcells
if isinstance(pvcells, PVcell):
# do NOT use copy, instead make one cell!
pvcells = [pvcells]
cached_cells = [0] * self.numberCells # map that points to the cell index
# to get the desired cell use: pvc = pvmod.pvcells[pvmod.cached_cells[idx]]
# might be able to overload __get_item__ or something, or make a custom get_pvcell() method
# this method seems BAD to me now
I have implemented a solution at the PVmodule level. My solution is:
I would like to implement a similar scheme at the string and system level before making a pull request and closing this issue.
You can see my progress by looking at the better_memory
branch of my fork
can you open a PR to pvmismatch? you may need to make a fresh clone your entire repo, delete the github version, then fork pvmismatch again and finally push your clone back up to the new fork.
Already fixed the fork (my repo points to the main one again). I'll open a PR now and reference this issue. All tests passed, by the way.
@bmeyers I don't think a lot of this is necessary,see this comment
Did you get my email regarding this? I would like to adhere to a "light" ethic if possible: ie: the smallest changes possible and delegate as much as possible. Is that OK with you? IMO I think we can get the desired memory optimization with very few changes. Then let's see how things work out before make more changes.
@bmeyers re: why I would want to keep the changes simple and not introduce too many new methods and class attributes. Too many changes at once makes the code difficult to comprehend. PVMismatch has IMO an intuitive structure. What we're doing now is a little fancy, so if we can make it easy, then it will be easier to maintain and make other changes in the future. This ethic had really paid off well for past changes to PVMismatch like the restructure we did for changing pvcell
temperature and other cell properties and when I did the Tetris changes. Big picture, small changes.
@mikofski the philosophy makes perfect sense to me. I'll review the email you sent more closely before continuing on. And I'll definitely plan on moving the "memory optimizer" code into contrib.
@bmeyers have a look at this example - only 3 one-line changes resolve issue #35
>>> from pvmismatch import *
>>> pvsys = pvsystem.PVsystem()
>>> pvsys.pvstrs
[<pvmismatch.pvmismatch_lib.pvstring.PVstring at 0x7fb0a384be50>,
<pvmismatch.pvmismatch_lib.pvstring.PVstring at 0x7fb0a384be50>,
<pvmismatch.pvmismatch_lib.pvstring.PVstring at 0x7fb0a384be50>,
...]
>>> pvsys.pvstrs
[<pvmismatch.pvmismatch_lib.pvstring.PVstring at 0x7fb0a384be50>,
<pvmismatch.pvmismatch_lib.pvstring.PVstring at 0x7fb0a384be50>,
<pvmismatch.pvmismatch_lib.pvstring.PVstring at 0x7fb0a384be50>,
...]
[id(pvc) for pvc in pvsys.pvmods[0][0].pvcells]
[140396627329168,
140396627329168,
140396627329168,
140396627329168,
...]
>>> pvsys.plotSys()
notice that the pvstring and module hashes are the same for all strings, all modules, etc. ditto for cells
well sure, and that's exactly what I did for PVmodule. But then the tests fail because setSuns doesn't work correctly. Which is what i was working on the rest of the day.
see my next 3 commits, that fixes setSuns()
or issue #34 - I think that this could be implemented a couple different ways so feel free to experiment around with the best way. I really wanted to figure out a way to use set(list of objects)
to get the unique objects, but I ended up using the "cache" idea, but just simpler, turns out that dict.fromkeys(list of objects)
does the same thing as set(list of objects)
just instead of a frozen set, you get a dictionary of unique objects as keys, and values as None
, useful so I could store the new copy as the value, and use the initial None
state to know whether to copy or not.
I made some tests, but there not exhaustive, and I bet there are still some edge cases that still need fixin.
also I noticed in pvstring.py
that there's already a sneaky way to create a list of pvcells from a single irradiance by passing in a scalar as Ee
and a list as cell indices - so my ideas for this in #34 are probably unnecessary.
observed: large systems take up too much system memory
proposed: don't make separate copies of objects until absolutely necessary.
instead of making copies and tracking all objects as individuals, just reference the same copy of identical objects. for example: when making a module, rather than copying the cell 96 or 128 times, just reference the same cell until there are differences. Ditto for modules - if all the modules in a string are identical, just reference the same module rather than making cheap copies until needed. in other words only create as many objects, the minimum, as needed to describe the characteristics and condition of the system.
issues:
cached_cells
field as a dictionary.~ Instead, an easier way is to just keep the lists the same, and put the reference you want in there. see my 1st comment belowalso profiling to demonstrate this is an improvement over current methods is required.