Eben60 / Mendeleev.jl

A Julia package for accessing chemical elements data.
https://eben60.github.io/Mendeleev.jl/

Mendeleev.jl - quo vadis? #1

Closed Eben60 closed 1 year ago

Eben60 commented 2 years ago

This package is intended as an addition to, or a replacement for, PeriodicTable.jl, using the data from the mendeleev Python package.

Although it is still a work in progress, in its current state it can already be used as a drop-in replacement for PeriodicTable.jl. The installable package code is based mostly on the PeriodicTable.jl code, but the Element_M struct code and the data (the files Element_M_def.jl and elements_data.jl, respectively) are generated by a separate module from the SQLite database, which was shamelessly taken from mendeleev. @lmmentel, thank you for your work!

The number, symbol and name indexing of ELEMENTS_M is the same as in PeriodicTable, and the representation of the elements is the same, too. All fields of the PeriodicTable Element are supported, in addition to some 60+ fields from mendeleev (see the mendeleev documentation). The mendeleev database is actively maintained, and the data seem to be taken from reliable sources. Updating Mendeleev.jl from the database is a matter of running a script.
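To illustrate what indexing by number, symbol and name over static data can look like, here is a minimal, self-contained sketch. It is not the actual Mendeleev.jl or PeriodicTable.jl code: ToyElement, ToyElements, TOY_DATA and TOY_ELEMENTS are hypothetical names, the struct has only three fields, and the case-insensitive name lookup is an assumption made for the example.

```julia
# Toy stand-in for a generated element struct (hypothetical; the real
# Element_M described above has many more fields).
struct ToyElement
    number::Int
    symbol::Symbol
    name::String
end

# Static data as it might appear in a generated source file.
# Assumption: entries are sorted so that position i holds atomic number i.
const TOY_DATA = [
    ToyElement(1, :H, "Hydrogen"),
    ToyElement(2, :He, "Helium"),
    ToyElement(3, :Li, "Lithium"),
]

# Hypothetical collection type; the lookup tables are built once at load time.
struct ToyElements
    data::Vector{ToyElement}
    bysymbol::Dict{Symbol,ToyElement}
    byname::Dict{String,ToyElement}
end

ToyElements(data::Vector{ToyElement}) = ToyElements(
    data,
    Dict(e.symbol => e for e in data),
    Dict(lowercase(e.name) => e for e in data),
)

# One getindex method per key type: number, symbol, name.
Base.getindex(es::ToyElements, i::Integer) = es.data[i]
Base.getindex(es::ToyElements, s::Symbol) = es.bysymbol[s]
Base.getindex(es::ToyElements, name::AbstractString) = es.byname[lowercase(name)]

const TOY_ELEMENTS = ToyElements(TOY_DATA)

# All three lookups return the same element.
println(TOY_ELEMENTS[2].name)
println(TOY_ELEMENTS[:He].name)
println(TOY_ELEMENTS["Helium"].name)
```

Because the data are plain constants and the dictionaries are filled once, every lookup is a simple array or hash access and needs no database at run time.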

The compiled code is static, as in PeriodicTable. Loading the package with using Mendeleev takes just 0.5 s on my modest i5 notebook (0.13 s with PeriodicTable preloaded, so further improvement is possible), and precompilation during installation takes 15 s.

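I have started the package just for fun and as an exercise in Julia. Now - what can we do with it?

  • It can be merged into PeriodicTable
  • Or it can be a separate package within JuliaPhysics
  • Or it can just stay with me as the author and maintainer.

In the last case I may announce it, but I am not sure about registering the package.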

Now I would humbly ask the PeriodicTable.jl author and contributors for their opinion. @rahulkp220, @carstenbauer, @Gregstrq, @stevengj - what do you think?

@Gregstrq - your data collection on isotopes could be put into a package in a similar way. And if somebody (@lmmentel? 😉) imports your collection into a database, a corresponding Julia package can be created based on the code of this package at minimal cost.

lmmentel commented 2 years ago

Hi @Eben60, cool initiative. I'm happy to look into adding isotope data to elements.db in mendeleev. Any chance you could export it to CSV, Parquet or something like that?

Eben60 commented 2 years ago

@lmmentel thank you.

If your question is about getting isotope data into elements.db, it is better to ask @Gregstrq.

If it is about getting them from elements.db into Mendeleev.jl: currently I produce the static Julia code from your database, with DataFrames.jl as an intermediate step. Concerning isotopes, however, I think it is better to put them into a separate package (except maybe for some basic data on stable isotopes).

lmmentel commented 2 years ago

Thanks, I'll continue the discussion in the PR against PeriodicTable.jl

Gregstrq commented 2 years ago

> I have started the package just for fun and as an exercise in Julia. Now - what can we do with it?
>
>   • It can be merged into PeriodicTable
>   • Or it can be a separate package within JuliaPhysics
>   • Or it can just stay with me as the author and maintainer.

There is also a question about data storage. It would be great to somehow store the data together with their units, but for that we need a Julia solution that supports Unitful. For one, we can store the tables in separate DataFrames. I think there are some routines in DataFramesMeta that allow emulating SQL. Interlinking the tables via the primary key can also be facilitated with the tools of DataFrames and DataFramesMeta.

There also seems to be an all-Julia database, but I don't know whether it supports Unitful.
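As a rough sketch of the DataFrames idea, the snippet below stores a column of Unitful quantities and links two tables through a shared key. The table layout and the column names (atomic_number, atomic_mass, mass_number) are assumptions made for this illustration, and it uses innerjoin from DataFrames itself rather than DataFramesMeta.

```julia
using DataFrames, Unitful

# Element table; the atomic masses carry the unit "u" (unified atomic mass unit).
elements = DataFrame(
    atomic_number = [1, 2],
    symbol = ["H", "He"],
    atomic_mass = [1.008u"u", 4.002602u"u"],
)

# Hypothetical isotope table, linked to the element table by atomic_number.
isotopes = DataFrame(
    atomic_number = [1, 1, 2, 2],
    mass_number = [1, 2, 3, 4],
)

# Equivalent of an SQL inner join on the key column.
joined = innerjoin(elements, isotopes, on = :atomic_number)

# The units survive the join and can still be converted afterwards.
joined.atomic_mass_kg = [uconvert(u"kg", m) for m in joined.atomic_mass]

println(joined)
```

The quantities are ordinary column values here, so filtering and joining behave as they would for plain numbers.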

lmmentel commented 2 years ago

It seems that one solution might be to use an actual database. SQLite seems like a good choice here, since it can store tables and is language agnostic. I haven't used it myself, but I saw that there is, e.g., SQLite.jl, which allows talking to the database. If you want something to test against, I can recommend elements.db, which is what I use to store data for mendeleev.

Eben60 commented 2 years ago

It looks like there is no point in merging with PeriodicTable. The latter has a concretely defined scope, i.e., to be a lightweight package with static data.

@Gregstrq, Mendeleev.jl is already a usable package. It contains almost all data from both PeriodicTable.jl and mendeleev by @lmmentel (some minor changes were made deliberately, i.e. due to revision), and it is still a lightweight package with static data. Here are the package load timings for both:

julia> @time_imports using PeriodicTable
    1.3 ms  ConstructionBase
    495.1 ms  Unitful
    18.1 ms  PeriodicTable

julia> @time_imports using Mendeleev
    1.5 ms  ConstructionBase
    506.3 ms  Unitful
    55.2 ms  Mendeleev

I think some 50 ms of extra load time should be acceptable. I still have to add the electronegativity functions, but I don't think that would add much load time.

Eben60 commented 2 years ago

> If you want something to test against, I can recommend elements.db, which is what I use to store data for mendeleev.

Actually, using SQLite.jl to extract data from your elements.db is exactly what I did. Afterwards the data are converted into the static source of Mendeleev.jl. Whenever you update your elements.db, Mendeleev.jl can thus be updated by just re-running the script.
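The workflow described above (query the database with SQLite.jl, pass through a DataFrame, emit static Julia source) could be sketched roughly as follows. This is not the actual generator script. To stay self-contained it builds a small in-memory database instead of opening elements.db; the table name, the column names, the function static_source and the constant ELEMENTS_DATA are all assumptions made for the illustration.

```julia
using SQLite, DataFrames

# In-memory stand-in for elements.db (assumed table and column names).
db = SQLite.DB()
DBInterface.execute(db, """
    CREATE TABLE elements (
        atomic_number INTEGER PRIMARY KEY,
        symbol TEXT,
        name TEXT,
        atomic_weight REAL
    )""")
DBInterface.execute(db, """
    INSERT INTO elements VALUES
        (1, 'H', 'Hydrogen', 1.008),
        (2, 'He', 'Helium', 4.002602)""")

# Step 1: pull the table into a DataFrame.
df = DBInterface.execute(db,
    "SELECT atomic_number, symbol, name, atomic_weight FROM elements ORDER BY atomic_number") |> DataFrame

# Step 2: turn each row into one line of static Julia source (hypothetical layout).
function static_source(df)
    io = IOBuffer()
    println(io, "const ELEMENTS_DATA = [")
    for row in eachrow(df)
        println(io, "    (number = ", row.atomic_number,
                    ", symbol = :", row.symbol,
                    ", name = ", repr(row.name),
                    ", atomic_weight = ", row.atomic_weight, "),")
    end
    println(io, "]")
    return String(take!(io))
end

src = static_source(df)
print(src)  # in a real script this text would be written to a .jl file
```

With this split, refreshing the package after a database update only requires re-running the generation step; the installed package itself never touches SQLite.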