Closed hagne closed 8 years ago
Lets try that mind map tool: https://www.mindmup.com/#m:a14a3d6860444d0133d0fa657bdfa6a28f
Wow...this is pretty old now. Still want to try mindmap?
Bump with new address... (sorry, don't really how to use this thing yet)
https://www.mindmup.com/#m:a1b616f4d096eb013322ac19bf3939d828
Are you guys trying to organize the project a bit better? Only thing have so far is maybe putting "aerosols", "mie" and "rayleigh" all in one directory "models" or "theory" or something like that to bundle theoretical/modeling stuff together?
Did you see the mindmup?
Yes, we are trying to reorganize the project. Right now, we don't really have a philosophy regarding the organization. Hagen and I have mostly just discussed it. There is now a branch (NewStructure) of the master where I am going to start rearranging. The repository as it stands is not really safe to fork yet given the terrible condition it is in (too much flux). I think that when
we will be in a good position. Here, we are just discussing the organization. Basically, we want this to be a package that could be registered with conda, pypi or other package distributors. So, to that end, the top level directory for distribution will be atmPy
. And that is where this discussion is concerned - what should the structure of the files be in this root directory.
Gotcha, maybe best thing for me to meet with you guys for lunch as you suggested and get a general feel for the project?
Sounds good...
That being said, I just want to move forward, so if I get time, I am going to move ahead on the branch NewStructure
when I get time...
I like your basic structure.
I extended your mindmap a little bit : https://www.mindmup.com/#m:a1402601009bc20133d67429457eab367d
Sorry, this is all over the place and I will likely contradict myself at some point below, but there is a lot to process here. I have a proposal though: if the routine(s) are related to processing data from a non-OTS set of software from an instrument, then THIS SHOULD NOT BE IN THE LIBRARY. In this, I would include non-OTS related instrument routines also such as the POPS. This would include a lot of Hagen stuff such as the Piccolo, Manta Payload, etc, etc. These routines are usually one-off routines and just create a lot of cruft for the user to sort through. On top of this, the routines often contain nothing useful except how to get the data into pandas or some other format. These routines should be the possession of the specific user or group who has need of these.
Instrument libraries that should be retained are those that
What say you?
btw - you can start to see how some of this is beginning to take shape in the NewStructure
branch.
What means OTS?
off the shelf
Totally aggree. Piccolo has nothing to do with aerosols anyway. We might want to use some of it in the example section to help people to parse there data, in particular when it comes to creating the time-stamps things can get rather tricky. Any comments on the mindmap: https://www.mindmup.com/#m:a155ce22409c500133b853081f2f88ed14 In particular, what do you think about the tools foulder. Some of the stuff in there needs to go somewhere else, like conversion. The idea is a tool-kit for very general helper functions?!? Maybe we should hide it better, because it is no used directly? This is actually a more general thing. I often outsource functions to improve readebilty and put them in a separate module which I call tools.py. Any comment on that practice?
Ya...I kind of sorta got into the tools folder and then realized that there was a lot of routines you were using for instrument specific applications. I had started a longer post that commented on these tools, but then ditched it when it became overwhelming. Regarding the tools folder:
thermodynamics.py
and conversion_tools.py
. The contain useful routines for converting between height and pressure. We should keep these but they also reference hagpack
. That needs to be ditched. At this point, I am not sure where to put these?plt_tools
looks to be something specific to how you present data.pandas_tools
looks to be some kind of tool that you specifically usemath_functions
contains one method - gauss
; can we not use the normal distribution in numpy for this?analysis
is...?time_tools
seems to have an implementation in mind.And regarding the mindmup - I am not sure how I feel about this suggestion. atm in the atmPy implies that this is atmospheric. Wouldn't general atmospheric routines such as lapse rate, etc. just go in the main folder?
Regarding tools: I will go through it and clean out anything that is used only by me. However, Its less than you might think, e.g. I am defining plotting routines that use the plt_tools (same with the pandas tools).
Regarding the atmosphere folder: I thought the same thing. However, putting it in the main folder? I thought we might not want any modules in the main folder, just other folders? But it's a good point, let's put it in the main folder and see how it feels ;-)
You might be right about this. Maybe we need a "general" folder?
Regarding plotting and pandas - do we want to lock the user into a particular way of expressing the data returned by the routines? pandas might be good but your plotting routines assume that the user will use matplotlib as the plotting tools; is that what you want?
We might want to go over this folder in person together. Don't get rid of anything on your end with the idea that you are restructuring - this could create some merge headaches down the line. Just let me know what you definitely can get rid of and I will.
pandas is essential to the sizedistribution module. Everyone should use pandas anyway ;-)
Locking people to matplotlib ... good question. However, these plotting functions are just soooooo useful. I would prefer not to get rid of them ... at least not all.
Yeah...but they are essential to you. Is the average user going to find them essential?
Regarding pandas - I think using it is fine. I just think the routine you have in there is not useful to everyone or is easily reproduced in private libraries...
I think we are mixing things up here, lets talk about this another time. Maybe we are getting to much into the detail here for now. I think we have an idea about the general Structure and maybe should go ahead and reorganize things into the folder. We can then focus on content of particular modules. What do you think?
I think that is incorrect - when thinking about structure you are inherently thinking about organization. We are attempting to make the library more coherent. To that end, we need to figure out 1) what belongs and 2) where it belongs. If these plotting and pandas routines are essential to the function of your stuff, then we need to figure out what stuff they are essential to so that we can place them exactly where they need to be. Stuffing everything into a generic tools
folder isn't sufficient. These tools are not used broadly across the library and tools
just becomes the defacto junk drawer of the library. If you want people to use this, then things need to be organized.
1) There belong to where ever I use plotting and pandas. Right now that means everywhere below /aerosols. Right now I am starting to work on radiation therefore I will need them there too. => These functions need to be close to the trunk of our tree
2) I do not expect any end user to use these tools. I would welcome other developers to use them though, although it is unlikely unless someone adopts to my style. It is likely that they will only be used by modules programmed by me. => Make them less visible? /_tools?
3) I will improve organization in the future. However, its supposed to be a toolbox, which are inherently messy? Or is there another concept than labeling a draw and chucking everything in it that kinda matches. I could make subfolders, or just comment more in the code?!?
Hmmmm....I thought I replied to this, but apparently I didn't. I will address these points randomly:
I do not expect any end user to use these tools.
This gets to a point I made above - if you don't expect users to use them, then why are they there? This library is not for developers but for users, no?
However, its supposed to be a toolbox, which are inherently messy?
My thought is that things that are intended for use by others shouldn't be "messy". If it is messy, then the directory hasn't been given enough thought.
There belong to where ever I use plotting and pandas.
I don't understand this. Why do you need them? What do they do that is integral to the functionality we are trying to provide?
I do not expect any end user to use these tools.
This gets to a point I made above - if you don't expect users to use them, then why are they there? This >library is not for developers but for users, no?
They are there because they are used by modules that are used by users.
There belong to where ever I use plotting and pandas.
I don't understand this. Why do you need them? What do they do that is integral to the functionality >we are trying to provide?
They are also used by modules that are used by users. Almost all data in modules written by me is based on pandas DataFrames and Panels, so I came across similar issues in different modules. Do reduce redundancy I am collecting these tools and reuse them.
OK...At some point I will look at the details. I think that the tools
folder will be revisited when I can actually look at how it's being used. I will have much more detailed rec's then.
So, this is what is messed up with the tools folder. I just discovered that you are duping some of my code. You have stokes_number
and stopping_distance
in the tools
folder. These should be removed in put in the aerosol physics folder as far as I am concerned. flow_rate2flow_velocity
should be in a general tools folder and would be better suited in the General
folder. This is kind of a mess...
sure thing!
was just checking out the the new structure branch. I know you are not a fan of very descriptive names and I am a fan of over descriptive names. What about a compromise?
in aerosols:
The last couple of dayes I worked on simplifying sizedistribution and actually splitting it up and making some of the class attributes available out side the class, so it can be used in a more functional form (as you suggested). I am fare from ready though, so it is probably still rather confusing. Anyway I wanted to play around with updating my changes to the structure changes you made in new_structure. BTW, the name of my branch was an accident and renaming it sounded kind of annoying. the name of the branch which has my changes is ht_sb_freeze.
Hmmmm....I haven't checked in for some time...I am testing some things to see if they make sense. I will take a look at this again.
Regarding instr
- really?? Name one distribution that has a directory called instr
that contains instructions.
Don't try to do anything related to this branch - I think that it is going to be tough merging things. I am trying to refactor appropriately where I can.
Regarding instr - really?? Name one distribution that has a directory called instr that contains instructions. I really do not understand your approach of that? Why not use real words like many other distributions. Just go though numpy and matplotlib. Sure they use some wired stuff, but alot are whole words. My opinion is that more descriptive words are increasing the learning curve. Maybe we should come up with a rule. Like:
- do abbreviate if longer than xy
- do not abbreviate if shorter than xy
I suggest that because I have the feeling this topic will come up more often in the future if we don't come up with a rule ;-)
Don't try to do anything related to this branch - I think that it is going to be tough merging things. I am trying to refactor appropriately where I can.
I think we should merge more rather than less, this makes us more familiar with how to handle problems. Also, I want to go ahead with things and would like to avoid to big clashes.
The pythonic mantra is readability. The more I google the more I come accross the opinions: "abbreviations should be avoided at all cost"
So silly compromise solution, instead of instrument or instru what if you called instru "device" instead. It has readability, but still maintains a clear tie to the contents?
sounds good to me!
The more I google the more I come accross the opinions: "abbreviations should be avoided at all cost"
This statement clashes with every repo. When is the last time you saw "utilities" or "library" or "input-output" written out in a module name. Take a look at naming conventions in popular repos. SciPy
does not call the folder fast_fourier_transform_package
- it is fftpack
. I think it is not that you don't want to shorten, it's that you don't want to shorten in a manner that obstructsreadablity.
From PEP-0008 on **module names:
Modules should have short, all-lowercase names
From Python Coding Conventions general comments about naming:
Avoid using names that are too general or too wordy. Strike a good balance between the two.
and example provided
- Bad: data_structure, my_list, info_map, dictionary_for_the_purpose_of_storing_data_representing_word_definitions
- Good: user_profile, menu_options, word_definitions
Long names can (and do) obstruct readability
The instr
discussion seems to be bikeshedding. instrument
is not a problem. The bigger issue is class/function definitions like convert2numberconcentration
or ExtinctionCoeffVerticlProfile
. These are overly verbose (IMO) and can be shortened in a manner that does not obscure their intention.
Nevertheless, this is a discussion for another issue that was called out by Hagen not for the structure but simply for naming conventions. I suggest that this discussion be shifted to another issue that focuses on these. Right now, I am more concerned with the structure as well as testing than I am about naming. I think naming will become more of an issue as we swing around and start examining what the package actually contains. For now, I am going to leave the naming argument...
I knew you would pick the these abbreviation which had been put into stone a thousand years ago and which are used in every programming language. If we come accross these things we of course do not write them in full length!
But as you quote further down. There is now reason to abbreviate: user_profile, menu_options, word_definitions which are longer then instrument or sizedistribution
ok! If its about the structure only, why not finalizing the mindmap and putting it into action?!?
Last thing to say about naming: you have chosen three of the bad examples provided. These are overly vague names and even if they weren't, there would not be any meaningful way to shorten them. I would suggest that things like num and conc and atm are meaningful to parties that use these packages. Surely you didn't struggle with what the library numpy
was intended to do, did you? And we can all agree that when we see a package that has *py
in it, we know that it is likely a python distribution? You are using these every day and don't seem to struggle with it; why do we have to get verbose when we start writing code?
What do you mean finalizing the mindmup? I am hoping that with NewStructure
, I can implement a vision that includes testing as well as some of the cruft removal and restructuring.
But how can I implement my changes if you don't want me to merge?
I thought, if the mindmap is set, it takes an hour to just do the changes.
I suggest we do multiple steps. 1) we create foulders according to the mindmap and copy the files into the orders 2) merge 3) fixing one file, split it up make it nice ... 4) merge 5) next file ... at the end of each step me make sure its still working.
Maybe if you want to do this then we tear the whole thing down and start adding everything back in. What you suggest above sounds...exhausting... I am pretty certain that what you suggest is going to take about a year. The refactor tool in PyCharms makes moving a breeze. Why do we need to go file by file?
Are you still changing your stuff that is in the library now?
I would use the refractor for each of these steps, thats why I thing its not problem to do it in the way I suggested it. In particular step 1) would be a no brainer. And yes I am still working on the libraries almost every day. If you want I can take my current working branch and reorganize it according to our proposed folder structure this weekend. Since nobody else is using our stuff anyway, I can than just pull it back into master and we both start testing our stuff and start fixing thinks. We are in a particularly nice situation right now, since our stuff is not too much entangled. So you can fix your stuff and I fix my stuff. Once we are done with that, we start talking about the next wave of changes. What do you think?
OK. I am now working off master and you will see an update in another 10 min. Pull. No code was deleted or changed but a lot was moved.
Done. Added a folder that contains files that I think should be removed. You may disagree. Let's discuss.
Also, will be adding tests. Let me know if we can close this.
what ever.
please just change instr to instruments and phys to physics and we then close it.
and rad to radiation
everything is lower case what about making General to general
I did. Did you not see it? Feel free to change the general
folder.
Can you get rid of GUIs
folder?
Everyone! :-) give some suggestions about how to rearrange the repository!!!!