GiacomoBoldrini / NN_Jet_Tagging

A preliminary analysis of the power of machine learning in tagging hadronic jets as either quark jets or gluon jets. We know that boosted objects create jets with a peculiar structure, and we are able to separate them into categories using kinematical cuts or the polarity of particles inside jets (N-subjettiness, energy correlation functions and others), but classifying gluon and quark jets is still an open and crucial problem for LHC data analysis.

How do I generate my own images from some public data sets? #2

Open mk123qwe opened 5 years ago

mk123qwe commented 5 years ago

For example, the Top Quark Tagging Reference Dataset: https://zenodo.org/record/2603256#.XYM14egzZPY

GiacomoBoldrini commented 5 years ago

Hi mk123qwe, thank you for your interest in my project. Sorry for the lack of explanation in the README; I will try to answer your question here. All these scripts take as input a .root file produced by the Delphes detector stage, for processes where the MadGraph parton-level generation yields quarks or gluons (identified by their particle ID: 1-8 for soft light quarks and antiquarks, and 21 for gluons). You therefore first need to run the parton-, particle- and detector-level simulation with some MC generators (I used MadGraph, Pythia and Delphes), e.g. in MadGraph: generate p p > g g, generate p p > q q.

However, I think you don't need this stage, as your link already provides structured arrays of variables from the events. So I think you can skip all the image-reading scripts and write your own. I'm sorry, but my code does not aim to generate images from a general data format; it is specific to .root files produced by Delphes!

Let me know if you need more info. I have not opened your .h5 file and at the moment I cannot. If you run into trouble, we can work out a solution based on what I've done in my scripts.

mk123qwe commented 5 years ago

I am not a physicist, but I'm interested in images and deep learning. I don't know how to get jet images from these events. Most events are stored in .h5 or .root files. I've noticed that some public data is available on the Internet, but it's hard for me to understand how these data are processed into images. Thank you for your help.

GiacomoBoldrini commented 5 years ago

Hi, sorry for the late reply, I hope it's not too late. From the description of your .h5 files I see that you have access to the top 200 constituents (particles) inside each jet. You will need the pT, eta and phi of each of them, along with the pT, eta and phi of the jet (just take the mean of eta and phi, and sum the pT of all the particles). Then you can construct your images with the following:

```python
import numpy as np
from tqdm import tqdm


def create_jet_image(image_vector, bins=100, eta=(-0.8, 0.8), phi=(-0.8, 0.8)):
    # bins + 1 edges delimit `bins` cells along each axis
    etas = np.linspace(eta[0], eta[1], bins + 1)
    phis = np.linspace(phi[0], phi[1], bins + 1)

    im_to_plt = []
    for jet_im in tqdm(image_vector):  # one entry per jet
        im = np.zeros((bins, bins))
        for i in range(bins):
            for j in range(bins):
                eta_inf, eta_sup = etas[i], etas[i + 1]
                phi_inf, phi_sup = phis[j], phis[j + 1]
                for el in jet_im:
                    # el = [eta, phi, pt]: add its pt to the pixel containing (eta, phi)
                    if eta_inf <= el[0] < eta_sup and phi_inf <= el[1] < phi_sup:
                        im[i, j] += el[2]
        im_to_plt.append(im)

    return np.array(im_to_plt)
```

image_vector holds one entry per jet, and each entry is a list of the form [[eta_1, phi_1, Pt_1], [eta_2, phi_2, Pt_2], ... , [eta_n, phi_n, Pt_n]], where 1,2,...,n are the n particles inside the jet.
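
If your file stores the constituents as Cartesian momenta (px, py, pz) rather than (eta, phi, pT), one entry of that list could be built roughly as below. This is only a sketch: the function name `build_image_vector` and the zero-padded input layout are assumptions, not something checked against the .h5 file. It also takes the jet axis from the summed momentum instead of the plain mean of eta and phi, which avoids the wrap-around of phi at ±π, and it returns eta and phi relative to that axis so that they fall inside a window such as [-0.8, 0.8].

```python
import numpy as np


def build_image_vector(px, py, pz):
    """Return [[d_eta, d_phi, pt], ...] for one jet (hypothetical helper).

    px, py, pz are equal-length sequences of constituent momenta;
    zero-padded entries (pt == 0) are dropped.
    """
    px, py, pz = (np.asarray(a, dtype=float) for a in (px, py, pz))
    pt = np.hypot(px, py)
    keep = pt > 0
    if not keep.any():
        return np.empty((0, 3))
    px, py, pz, pt = px[keep], py[keep], pz[keep], pt[keep]

    eta = np.arcsinh(pz / pt)   # pseudorapidity of each particle
    phi = np.arctan2(py, px)    # azimuthal angle in (-pi, pi]

    # Jet axis from the summed momentum (a substitute for the plain mean)
    jet_px, jet_py, jet_pz = px.sum(), py.sum(), pz.sum()
    jet_eta = np.arcsinh(jet_pz / np.hypot(jet_px, jet_py))
    jet_phi = np.arctan2(jet_py, jet_px)

    d_eta = eta - jet_eta
    # Wrap the azimuthal difference back into [-pi, pi)
    d_phi = (phi - jet_phi + np.pi) % (2 * np.pi) - np.pi

    return np.column_stack([d_eta, d_phi, pt])
```

Calling this once per jet and collecting the results in a list gives something you can pass directly as image_vector to create_jet_image.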

im_to_plt contains one bins x bins image per jet.
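
The nested loops in create_jet_image visit every particle for every pixel, which gets slow for 100 x 100 images and many jets. As an alternative sketch (not the code used in my scripts), the same pT-weighted binning for a single jet can be done with `np.histogram2d`. The name `jet_image_hist2d` is hypothetical; note also that NumPy includes the upper edge in the last bin and ignores particles outside the window.

```python
import numpy as np


def jet_image_hist2d(jet_particles, bins=100, eta=(-0.8, 0.8), phi=(-0.8, 0.8)):
    """Bin one jet's [[eta, phi, pt], ...] list into a bins x bins image."""
    arr = np.asarray(jet_particles, dtype=float).reshape(-1, 3)
    image, _, _ = np.histogram2d(
        arr[:, 0],            # eta runs along the first image axis
        arr[:, 1],            # phi runs along the second image axis
        bins=bins,
        range=[eta, phi],     # particles outside this window are ignored
        weights=arr[:, 2],    # each particle contributes its pt
    )
    return image
```

For many jets, something like `np.stack([jet_image_hist2d(j) for j in image_vector])` should give an array with the same layout as im_to_plt.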

Hope it was helpful