pvlib / pvlib-python

A set of documented functions for simulating the performance of photovoltaic energy systems.
https://pvlib-python.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Organize pvlib.data #1056

Open cwhanse opened 4 years ago

cwhanse commented 4 years ago

pvlib.data currently contains 1) databases for module and inverter models; 2) Linke turbidity values; 3) data files for tests and examples; and 4) variable_style_rules.csv. It is accurately described as a "junk drawer." The majority of the data files are in category 3, supporting tests.

As a start, maybe create data.tests, and perhaps subfolders within tests that mirror the subfolders in pvlib.tests. And perhaps add a prefix or other text to file names to help identify where or how each file is used, e.g., PVsyst_demo.csv becomes test_sdm_pvsyst_demo.csv.
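A minimal sketch of the renaming idea, assuming the prefix is built as `test_<module>_` followed by the lowercased original name. The helper name `prefixed_test_name` and the `module` argument are hypothetical, not part of pvlib.

```python
from pathlib import PurePath


def prefixed_test_name(filename, module):
    """Build a name like test_<module>_<original stem, lowercased><suffix>.

    Hypothetical helper: the exact naming scheme is an assumption based on
    the single example given in the comment above.
    """
    p = PurePath(filename)
    return f"test_{module}_{p.stem.lower()}{p.suffix}"


print(prefixed_test_name("PVsyst_demo.csv", "sdm"))  # test_sdm_pvsyst_demo.csv
```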

wholmgren commented 4 years ago

Mirroring the structure would be an improvement.

Another option to consider is moving the subpackage tests into the subpackage, along with a data subdirectory within that test directory. For example:

pvlib/data  # databases, linke turbidity, anything a user might need
pvlib/tests  # test_atmosphere.py, etc
pvlib/tests/data  # singleaxis_tracker_wslope.csv, etc
pvlib/iotools/tests  # test_tmy.py, etc
pvlib/iotools/tests/data  # pvgis_tmy_test.dat, etc
pvlib/ivtools/tests
pvlib/ivtools/tests/data

I proposed a similar structure when we created pvlib/tests/iotools. I was outvoted but I still think it's better!
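With a layout like this, each test module could find its data files relative to its own location instead of through a shared pvlib/data path. A minimal sketch, assuming a `data` directory sits next to the test module; `data_dir_for` is a hypothetical helper, not an existing pvlib function.

```python
from pathlib import Path


def data_dir_for(test_file):
    """Return the 'data' directory that sits next to the given test module.

    Hypothetical helper for illustration; it only builds the path and does
    not check that the directory exists.
    """
    return Path(test_file).resolve().parent / "data"


# Possible usage inside pvlib/iotools/tests/test_tmy.py (assumed layout):
# DATA_DIR = data_dir_for(__file__)
# tmy_file = DATA_DIR / "pvgis_tmy_test.dat"
```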

cwhanse commented 4 years ago

I'm in favor of pvlib/tests/data, etc. rather than pvlib/data/tests.

echedey-ls commented 1 month ago

68 files to organize... I propose splitting up the work to make it manageable. I'd say the first step is to categorize each file. Feel free to react to this message with a 👍 if you want to contribute to that. Next week I'll split the files into ranges and assign one to each volunteer by mentioning you in this message.

But for those brave enough, you can work on it now: https://docs.google.com/spreadsheets/d/12LeEFa9-wRqc3v7utfgcTk96KTmaWfhHSPkLx6K0QhY/edit?usp=sharing

There are five categories: the four Cliff listed in this issue plus one for unknown; multiple can be selected for each entry. My approach would be to look up where these files are mentioned and select the appropriate labels. This could be automated, but I don't feel like overengineering today, and that wouldn't take into account files not mentioned anywhere (if any).
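For anyone who does want to automate the lookup, here is a rough sketch. It scans source files for each data file name and derives a label from where the mention occurs. The label names, the path rules in `LABEL_RULES`, and the function names are all assumptions for illustration; they are not the spreadsheet's actual options. Files that are never mentioned come back as "unknown".

```python
from pathlib import Path

# Hypothetical rules: (label, path component that triggers it).
LABEL_RULES = [("tests", "tests"), ("docs/examples", "docs")]


def label_for(referrer):
    """Label a mention by the directory of the file that contains it."""
    for label, part in LABEL_RULES:
        if part in referrer.parts:
            return label
    return "library"  # assumed fallback: mentioned from library code


def classify_data_files(data_dir, search_root, suffixes=(".py", ".rst")):
    """Map each file name in data_dir to the sorted labels of its mentions."""
    data_dir, search_root = Path(data_dir), Path(search_root)
    names = sorted(p.name for p in data_dir.iterdir() if p.is_file())
    labels = {name: set() for name in names}
    for src in search_root.rglob("*"):
        if not src.is_file() or src.suffix not in suffixes:
            continue
        text = src.read_text(encoding="utf-8", errors="ignore")
        rel = src.relative_to(search_root)
        for name in names:
            # Plain substring match: may over-match when one file name
            # is contained in another.
            if name in text:
                labels[name].add(label_for(rel))
    return {name: sorted(found) or ["unknown"] for name, found in labels.items()}
```

Something like `classify_data_files("pvlib/data", ".")` run from the repository root would give a first pass; the result still needs a manual check, since file names built dynamically in code will not be found by a text search.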

React 👎 if you are against doing it this way (and potentially have a better idea)

cwhanse commented 1 month ago

@echedey-ls can you add "Lookup table" or something like that to the pull-down options, for files like the CEC module parameters?

echedey-ls commented 1 month ago

This is a summary of the classification, available in the spreadsheet's third sheet.

[image: summary of the classification]

AdamRJensen commented 1 month ago

This is a summary of the classification, available in the spreadsheet's third sheet.

@echedey-ls This looks good to me.

Should we create a sub-folder for the files only used for testing, i.e., the files that can be excluded?
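One way to see what such a sub-folder would contain is a dry-run move plan. The sketch below assumes a mapping from file name to classification labels, where files labeled only "tests" count as test-only. The label name, the destination `pvlib/tests/data`, the function name, and the example input are assumptions for illustration, not the actual classification.

```python
from pathlib import PurePosixPath


def plan_test_only_moves(labels_by_file,
                         src_dir="pvlib/data",
                         dst_dir="pvlib/tests/data"):
    """Return (source, destination) pairs for files labeled only 'tests'.

    Dry run: nothing is moved; the caller decides what to do with the plan.
    """
    moves = []
    for name, labels in sorted(labels_by_file.items()):
        if set(labels) == {"tests"}:
            moves.append((str(PurePosixPath(src_dir) / name),
                          str(PurePosixPath(dst_dir) / name)))
    return moves


# Made-up input for illustration only.
example = {
    "only_used_in_tests.csv": ["tests"],
    "used_in_tests_and_docs.csv": ["tests", "docs/examples"],
}
for src, dst in plan_test_only_moves(example):
    print(src, "->", dst)
```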