scikit-hep / aghast

Aghast: aggregated, histogram-like statistics, sharable as Flatbuffers.
BSD 3-Clause "New" or "Revised" License

Adding boost-histogram to tutorial #39

Open HDembinski opened 4 years ago

HDembinski commented 4 years ago

Is it possible to add boost-histogram to the tutorial on how to use Aghast? People are asking me how to convert ROOT histograms to boost-histograms and it would be nice to point them to the tutorial.

LovelyBuggies commented 4 years ago

@HDembinski After discussing with Henry, we plan to add a notebook to the tutorial to deal with it.

LovelyBuggies commented 4 years ago

A more detailed expansion to this.

LovelyBuggies commented 4 years ago

@jpivarski Hi, Jim. I am testing the integration between boost-histogram and aghast and have hit a tricky problem. Remember the pull request I opened to fix the KeyError: 'sumw2' problem? The error shows up again when I test boost-histogram, because I am NOT running the latest aghast (I am on a version without my fix). Could you please tell me how to run the latest aghast when testing boost-histogram?

jpivarski commented 4 years ago

Outside of any Aghast or Boost-Histogram directory, run

pip uninstall aghast

enough times to delete all versions of it in your pip repositories. (For me, pip is a directory inside of Miniconda because I always have Miniconda enabled. If it's a system directory for you, you'll have to sudo.)

Then, in the git directory for Aghast, do

pip install .

to install whatever has been git pulled into that directory in your pip area. Again, if you're using a system-wide pip, you may need to sudo. Also, if you've been using user-space pip in your ~/.local directory, then you may need --user.
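
One way to confirm which copy a given interpreter or Jupyter kernel actually picks up is to ask Python where it would import the package from. This is a minimal sketch using only the standard library; the helper name `locate_package` is made up for illustration and is not part of Aghast.

```python
import importlib.util
import sys


def locate_package(name):
    """Return the file Python would import `name` from, or None if it is not importable."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Run this in the same interpreter or notebook kernel that runs the failing code.
print("interpreter:", sys.executable)
print("aghast would be imported from:", locate_package("aghast"))
```

If the printed path points into a `site-packages` directory of a different environment than the one you installed into, the kernel is not using the fresh install.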

LovelyBuggies commented 4 years ago

@jpivarski Thanks. I am using aghast's conda env as my jupyter kernel -- aghast kernel -- to test my notebook in boost-histogram. Specifically, I am testing the following code:

import aghast
import boost_histogram as bh
import numpy as np

h = bh.Histogram(bh.axis.Regular(50, -3, 3))
h.fill(np.random.normal(size=1_000_000))
ghastly_hist = aghast.from_numpy(h.to_numpy())
ghastly_hist.dump()
root_hist = aghast.to_root(ghastly_hist, "root_hist")

There are 2 situations:

  1. When I follow the conda installation guide at https://github.com/scikit-hep/aghast/tree/master/python#manual-installation and run the code in the Python CLI from the aghast/python directory, I get the following message:
>>> ghastly_hist.dump()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/nino/Documents/GitHub/aghast/python/aghast/interface.py", line 322, in dump
    file.write(self._dump(indent, width, end))
  File "/Users/nino/Documents/GitHub/aghast/python/aghast/interface.py", line 9375, in _dump
    _dumpeq(self.counts._dump(indent + "    ", width, end), indent, end)
  File "/Users/nino/Documents/GitHub/aghast/python/aghast/interface.py", line 8047, in _dump
    _dumpeq(self.sumw2._dump(indent + "    ", width, end), indent, end)
AttributeError: 'NoneType' object has no attribute '_dump'
  2. When I use the aghast kernel in boost-histogram's JupyterLab, the KeyError issue still exists:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-3-36362cc8a9a5> in <module>
      3 ghastly_hist = aghast.from_numpy(h.to_numpy())
      4 ghastly_hist.dump()
----> 5 root_hist = aghast.to_root(ghastly_hist, "root_hist")
      6 root_hist

~/anaconda3/envs/aghast/lib/python3.8/site-packages/aghast/__init__.py in to_root(obj, name)
     25 def to_root(obj, name):
     26     import aghast._connect._root
---> 27     return aghast._connect._root.to_root(obj, name)
     28 
     29 def from_root(obj, collection=False):

~/anaconda3/envs/aghast/lib/python3.8/site-packages/aghast/_connect/_root.py in to_root(obj, name)
    134         sumw = obj.counts[tuple(slc)]
    135         if isinstance(sumw, dict):
--> 136             sumw, sumw2 = sumw["sumw"], sumw["sumw2"]
    137             sumw2 = numpy.array(sumw2, dtype=numpy.float64, copy=False)
    138         else:

KeyError: 'sumw2'

I believe that if the aghast kernel works in boost-histogram, the boost-hist kernel will work too. But right now I don't think I am using the latest version of aghast. Do you have any idea how to deal with these two problems?

P.S. I always use sudo with -H (when it prints a warning).
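
The second traceback fails because `to_root` reads `sumw["sumw2"]` from a dict that contains `"sumw"` but no `"sumw2"` key. The following is a rough illustration of a defensive pattern for that situation, not the actual Aghast fix. The function name `split_counts` is hypothetical, and falling back to the counts as their own variances is an assumption that is only valid for unweighted fills.

```python
def split_counts(counts):
    """Return (sumw, sumw2) from a counts value that is either a dict or a plain sequence."""
    if isinstance(counts, dict):
        sumw = counts["sumw"]
        # dict.get returns None instead of raising KeyError when "sumw2" is absent.
        sumw2 = counts.get("sumw2")
    else:
        sumw, sumw2 = counts, None
    if sumw2 is None:
        # Assumption: unweighted fills, so each bin's variance equals its count.
        sumw2 = sumw
    return list(sumw), list(sumw2)


# A dict without "sumw2" no longer raises; an explicit "sumw2" is kept as given.
print(split_counts({"sumw": [3, 5, 2]}))
print(split_counts({"sumw": [3.0, 5.0], "sumw2": [4.5, 9.0]}))
```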

LovelyBuggies commented 4 years ago

For reference, here is what I have done:

cd aghast/python  
conda env create -f environment-test.yml -n aghast 
conda activate aghast    
python setup.py install    
python -m ipykernel install --name aghast 
python3
# situation one
conda deactivate
cd ../../boost-histogram
jupyter lab
conda env create -f dev-environment.yml -n boost-hist      # dev-env yaml has no aghast
conda activate boost-histogram
# go to the aghast/python directory and install aghast the same way as above
python -m ipykernel install --name boost-hist
# use boost-hist as notebook kernel and restart
# situation two

henryiii commented 4 years ago

Why does the environment.yml install aghast from conda-forge? Then you'll have the aghast conda package, rather than the latest master.

https://github.com/scikit-hep/aghast/blob/69affd901460eec2c14683e4ce0c7b830f96a63a/python/environment-test.yml#L16

I think pip: ["."] would be better.

Anyway, I think what you need to do here is conda uninstall aghast before you python setup.py install (and I would always use pip install . instead of python setup.py install)
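
To see whether the active environment still holds the conda-forge package or a local install, the installed distribution's metadata can be inspected. This is a sketch under assumptions: the helper name `installed_info` is invented for illustration, and the `INSTALLER` metadata file is optional, so it may be missing for some installs.

```python
from importlib import metadata


def installed_info(dist_name):
    """Return the version and recorded installer of a distribution, or None if it is absent."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # INSTALLER usually names the tool that installed the package (for example "pip" or "conda").
    installer = dist.read_text("INSTALLER")
    return {
        "version": dist.version,
        "installer": installer.strip() if installer else None,
    }


print(installed_info("aghast"))
```

A result of `None` after `conda uninstall aghast` would indicate that no copy is left in that environment before running `pip install .`.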

LovelyBuggies commented 4 years ago

> Why does the environment.yml install aghast from conda-forge?

Sorry for that typo, I'll remove that.

> I would always use pip install . instead of python setup.py install

Okay, let me have another try.

HDembinski commented 4 years ago

Hi, what is the status of this?

I looked into these two tutorials:

https://github.com/scikit-hep/boost-histogram/blob/develop/notebooks/aghast.ipynb
https://scikit-hep.org/scikit-hep-tutorials/content/aghast.html

but neither shows how to convert a boost-histogram to a ROOT histogram directly. Instead, there is a lossy conversion from ROOT to NumPy to boost-histogram and vice versa. Both tutorials are called "Aghast tutorial", but neither uses Aghast. Isn't the point of Aghast to allow direct, lossless conversion between various histogram formats?

henryiii commented 4 years ago

@jpivarski hasn't had time to work on Aghast since I started boost-histogram, so it doesn't read boost-histogram properly yet. If he doesn't get time by this milestone, @LovelyBuggies and I will be working on implementing this in Aghast. These tutorials will be updated when direct conversion is possible.

You are right; that is 100% the final intention. It's just not fully implemented yet.