CalebBell / chemicals

chemicals: Chemical database of Chemical Engineering Design Library (ChEDL)
MIT License
186 stars 36 forks source link

Files to be ported #4

Closed CalebBell closed 3 years ago

CalebBell commented 4 years ago

What was moved over, was moved over; and that is pretty good.

CalebBell commented 4 years ago

I am no longer thinking it makes sense to include the law.py file. Governments are constantly updating their data files, but I have never updated any data I had. I could see it as a separate library following this style in the future if it became important to have though.

yoelcortes commented 4 years ago

Good point, although it wouldn't hurt leaving it there since we are lazy loading everything. I can't see myself needing use it, so really up to you.

yoelcortes commented 4 years ago

I'll take care of heat_capacity.py

yoelcortes commented 4 years ago

I'll take care of volume.py

CalebBell commented 4 years ago

Hi Yoel, Can you wait a bit? I am mostly done with volume.py, but am also still trying to catch up reactions and heat capacity.

yoelcortes commented 4 years ago

Sure! Let me know what you'd like me to work on. I'll can probably start working on it at night.

CalebBell commented 4 years ago

Hi Yoel, The big thing I have worked on this week has been the numba interface. Admittedly this is more on the fluids/other library work side of things. But the same stuff can apply to chemicals. Here are some comments in that repository. https://github.com/CalebBell/fluids/blob/master/tests/test_numba.py#L383 I would really appreciate it if you could try out the interface, and if you knew the answers to any of the comments I have there. The biggest thing I don't think will ever be changed by numba is changing the return type of a function;. The get_methods/AvailableMethods thing I developed does not play nice with numba. I don't know what I am going to do about those functions.

I found an interface to most of scipy.special with numba, which is awesome although it only covers that so far (https://github.com/numba/numba-scipy). Some of the special functions, like lambertw, aren't wrapped with Cython and don't seem to work.

Another area that's in rough shape that I don't know when I will get to is safety.py. I will get to it eventually but it's there if you want to work on it. I uploaded the latest files. There is one group of methods - Ceiling, STEL, TWA, Skin - which has only one data source for Canada only. But this is a critical piece of information. It would be nice to have more sources for those pieces of information too, although that's more of a post-initial release thing too.

On the fire side, there are a couple poorly documented/tested methods (UFL_mixture and LFL_mixture); I never got into it too much. Maybe they are suitable for removal, or they require further literature search.

safety.txt test_safety.txt

yoelcortes commented 4 years ago

Hi Caleb,

Regardless of whether we are looking to get functions to support numba, I would recommend to separate the "get_methods" behavior. There is beauty in simple function behavior and having two functions, one for using finding the methods available and one for returning the actual value, would probably be good enhancement on the long run. It would simplify the documentation as well as make it more computationally efficient (albeit just on the nanosec level jaja).

The reason why I bring this up is because I'm starting on the safety.py module, and I'd like to do just that. If you agree on this, I'll also take the time to separate out this behavior in all other methods, including Tc, Pc, Tb, Hfg, etc.

I'm planning on making these functions as:

getmethods{property} -> get_methods_Tc

So that when users write get_methods, they can see all the available "get_methods" functions.

Please let me know if you agree, Thanks!

CalebBell commented 4 years ago

Hi Yoel, I did this procedure for the methods in fluids. It wasn't just numba, which still isn't supported for those methods. It was just a desired to change a behavior which keeps causing trouble.

I am happy with the get_methods_Tc convention, although in fluids I was going for Tc_methods. I didn't have the convenient lists of methods in fluids already though.

Thanks for paying attention and staying alert, I was thinking it would be good to do to chemicals but it was not an immediate priority for me.

Cheers, Caleb

yoelcortes commented 4 years ago

Coolio, in this case, I'll also use the {property}_methods convention and change the lists of methods to {property}_all_methods, and take care of these changes in the tests. Thanks for the quick reply!

CalebBell commented 4 years ago

Hi Yoel,

That was a massive commit you pushed - thank you so much!

I feel I have slowed down getting a first public release a lot, with the methods things and numba but hopefully a higher quality library is the result.

I took a look at the identifiers module in thermosteam and found quite a few changes.

I was wondering if you could describe what your goals for you changes are?

For me, I am pretty happy with the existing identifers.py file in thermo except with zero loading of any chemicals by default, and I'm not sure I want the mixture composition data to be there either. The documentation could be better though.

Honestly my biggest concern is handling the data files, which I have a bunch of scripts for in https://github.com/CalebBell/chemical-metadata but it is far from a library or easy to use/update.

yoelcortes commented 4 years ago

Hi Caleb,

No worries, I've also slowed down quite a lot too due to many presentations and workshops I've given recently, but I'm still hoping we can make a prelease by August. I'm sure this be a high quality library and, with the right documentation, possibly a game changer compared to other libraries.

As for the changes I've made and why:

Things I changed, but could be added back:

There are also some minor algorithm changes in getting metadata that made it faster, and although I don't think they would be incompatible with the previous identifiers, it would be nice to test this somehow.

I haven't had a good look at the chemical-metadata package, so not sure how it fits in with chemicals, but I'll have a closer look at it some time this week.

Thanks!

yoelcortes commented 4 years ago

I'll go ahead and focus on adding the UNIFAC and activity coefficients modules. They won't be object oriented, but the activity coefficient functions will take in optional parameters that should be cached elsewhere.

CalebBell commented 4 years ago

Hi Yoel, It's great that you want to add UNIFAC! It is a really great model. I don't want chemicals to be a monster library though. I wasn't planning on having any activity coefficient models or equations of state in it; just pure chemical data, correlations for chemical properties, and the chemical metadata. I fear trying to include it will bring out scope creep. Numba has been a huge amount of work, and is not done. I am open to collaborating on another library for the models - possible with the flash algorithms included, possibly in a separate library - but I want the release of chemicals done and out the door first. By the way, I got my first bug report into numba: https://github.com/numba/numba/issues/6007 Cheers, Caleb

yoelcortes commented 4 years ago

Good idea! This would really shorten the effort to get this out. I'd like to get Thermosteam running on chemicals too. So other than the identifiers (which I believe you're already taking care of), what else is there to do regarding code?

If there is nothing left to code, I can start working on the docs. I'll post another issue on the weekend and propose a draft of the docs and next steps.

Thanks!

yoelcortes commented 4 years ago

@CalebBell, just saw your new branch/commits, looking good. I'm taking note of the changes you made to heat_capacity (try IS_NUMBA except is preferable, and not using properties for simplicity and speed).

I'm still working on the docs. Hopefully I can make a commit and ask some questions soon. Sorry for the delay!

yoelcortes commented 4 years ago

Hi Caleb

I added early draft of the docs and the readme file. The readme file has TODOs which are mostly my comments/questions for you. The docs has similar layout to BioSTEAM and fluids. I added titles for possible tutorials. I'm also proposing we have Developer's guide to show how people should run pytests and keep the load speed fast (as well as other standards we have for the package). A code of conduct could be useful for future reference too, but if you feel like its unnecessary then let's not add it (its OK either way).

I'm thinking it would be nice to get a logo for ChEDL so you can put it in all your related packages. Have had any thoughts on this? I personally think BioSTEAM's logo is a little big and complex (a coworker made it for fun on their free time, and we sort of just ended up using it everywhere), so maybe you can shoot for something simple like pytorch's logo, or something pleasing to the eye like the logo here https://chemics.github.io/

So I tested the docs using a sphinx's "make.bat" file. Do you have a preferred way to make the docs (and tests them)? Also, would you prefer to have the tutorial as rst files, or would you be open to a Jupyter notebook? I feel that jupyter notebooks really accelerate the process of making a tutorial.

Thanks!

CalebBell commented 4 years ago

Hi Yoel, Thank you again for your hard work and support. A developers guide is something I have been meaning to do although it will always be a work in progress. This will be especially good from my point of view for documenting those things I myself struggle to remember :) I am find with a Jupyter note book tutorial, but I would like an automated way to update their results. I am also trying to get that for fluids which has some jupyter notebook examples. I wouldn't want the tutorials to get as long as say fluids's tutorial in a single file though - I find large notebooks are really easy to get lost in. I don't have any thoughts on a logo, I'm not really an arts person. I'll think about it?

I would prefer a python script for generating the docs to a make.bat file, as I run Linux. I would really like the tests to run the docs as well.

yoelcortes commented 4 years ago

Thanks for the quick reply, Caleb. Since this package doesn't have much plotting or diagram features, I'll start making the tutorial in rst files so we can doctest them. There are some jupyter notebook extensions that help you automatically test code, but they require a specific format that is not too pleasing to the eye.

I'll open a new issue on the developer's guide to note a list of the things we need (feel free to edit the post).

I haven't looked into using a python to generate the docs, so if you already have one you could upload it with some comments on how to run it.

Thanks!