ynikitenko / lena

Lena is an architectural framework for data analysis
Apache License 2.0

Share your story #1

Open ynikitenko opened 3 years ago

ynikitenko commented 3 years ago

Lena has existed for more than a year as an open-source (free software) project, and it would be great to hear your story if you use it!

How did you learn about Lena? What kind of data analysis do you do? Which parts of Lena do you find most useful? What other packages or languages do you use?

Your feedback is most valuable for the project's development.

ynikitenko commented 3 years ago

Inspired by this issue in a different project.

ynikitenko commented 3 years ago

I have been doing data analysis for a long time (15 years or so); I started with C++, but then used Python more and more. I often ran into the same problem: as my programs grew larger, they became harder and harder to maintain and extend. I knew and had used the Django framework for Web development, so I decided to create a framework for data analysis (I mean an architectural framework, one that tells me how to structure my program better). I found that the MVC architecture was not well suited to data analysis, and arrived at Lena sequences. They turned out to be general enough: I have been using them for more than a year and have had to change nothing in the core classes (apart from some details of less important, rarely used elements). I consider this a success (usually I would have turned to another architecture or set of tools by that time).

I analyse data in the Double Chooz neutrino physics experiment. I'm working on a PhD thesis, "Determination of the direction to a source of antineutrino via inverse beta decay". I work at the Institute for Nuclear Research of the Russian Academy of Sciences (Moscow). (The framework is independent of my institution and is developed in my free time.)

I love how easy it is to create many plots in Lena. Presentation seems to be one of the most important aspects of data analysis (by making more plots of the data, I even arrived at better models than those I had originally considered). There are probably more things I like: I keep adding new ones as I need them in my analysis.

This may be strange, but I don't use neural networks or other popular tools; I use traditional statistics. Nor do I use pandas, numpy or scipy (though I like scipy and used it some time ago). Usually I write my own algorithms or use the standard Python libraries. Recently I had to make difficult fits (about 30 parameters for 3-dimensional data), and found that the ROOT framework (in C++), namely its RooFit part, works great for that (and I use C++ for very intensive jobs, passing the higher-level results to Python and Lena afterwards). I use Linux (and only that), currently Arch Linux, and all its standard tools when needed (small bash scripts, Makefiles, etc.).