Template for bound - Githubissues

micheledoro commented 2 years ago

I have added a first proposal for the bounds template file in the /templates/ folder. There are several things to agree upon:

what columns do we put?
what metadata do we put?

I made a proposal: we can prepare one file per target per paper, covering different channels (e.g. Segue in this paper for the bb and tt channels). However, by changing the metadata and the columns we can change the number of files we generate (e.g. one file per paper, one file per bound, etc.). Think about it. Clearly we had better settle on a final template for the data first, so that importing will be easy later on.

moritzhuetten commented 2 years ago

Great, thanks! I actually propose one file for each bound/curve (so only two columns, mass and sigmav), because when sampling by hand it is impossible to sample several curves at the same x-axis values. For plotting curves together later, transforming them onto a common grid (spline interpolation) will probably be needed anyway. I propose to add the following metadata:

{reference: "Title of the paper"}
{doi: "DOI of the paper"}
{arxiv: "arXiv ID of the paper"}
{instrument: "Instrument name"}
{year: "year of publication"}
{month: "month of publication"}
{source: "Target name"}: I propose to follow the phrasing of our paper, e.g. "MW Inner Halo".
{channel: "channel name"}: I propose to use tex format, such that the meta data can be directly used for nice plotting/displaying of the channel.
{confidence: "Confidence level as fraction, i.e. 0.95"}
{dmfraction: "fraction of branching channel"}: is this really useful/needed?
{figure: "Description"}: I propose to refer to the paper figure from which the data was extracted.
{comment: "Any comments on the result"}
{status: "Status of file"}

For the file names, I propose: year_instrument_target_channel.ecsv

digits/precision? I propose that seven digits are enough.

I put an example file according to this proposition in bounds/hess/2016_hess_gc_bb.ecsv.
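As a rough illustration of the seven-digit proposal, one row of the two-column file could be formatted as below. The helper name format_row and the space separator are assumptions made for this sketch, not part of the agreed template.

```python
def format_row(mass, sigmav, digits=7):
    """Format one (mass, sigmav) sample with `digits` significant digits."""
    # The "g" presentation type counts significant digits and switches to
    # exponent notation for very small or very large values.
    return f"{mass:.{digits}g} {sigmav:.{digits}g}"


print(format_row(1000.0, 1.23456789e-26))  # prints: 1000 1.234568e-26
```

Significant digits (rather than fixed decimals) seem the natural reading here, since sigmav values span many orders of magnitude.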

moritzhuetten commented 2 years ago

I also propose adding a line "# %Part of https://github.com/moritzhuetten/DMbounds. Licensed under a XXX license - see LICENSE.rst" at the top of each file. We should also agree on which license we want to use for the project.

micheledoro commented 2 years ago

You convinced me about the two columns.

As for the metadata, all is OK except that I would not include the month of the paper. Do you think it is needed in case of multiple papers per year? In that case I would use e.g. 2020b.

As for the file name, I propose instead

instrument_year_target_channel.ecsv

We can also define the experiment names as follows (capitals in the plots, lower case in the file names):

MAGIC, CTA, SWGO, HAWC, LHAASO, VERITAS, WHIPPLE, LAT

and channels (latex in the plot, text in the file name):

bb, tautau, ee, tt, WW, gg, Zg, ...

b\bar{b}, \tau^+\tau^-, e^+e^-, t\bar{t}, W^+W^-, \gamma\gamma, Z\gamma, ...

We should also discuss the fact that limits could include other factors such as boosts, the Sommerfeld effect, or different hypotheses, so the file name should probably also contain an optional text field for such cases:

instrument_year_target_channel_(text).ecsv

moritzhuetten commented 2 years ago

Ok, changed template in f88e31c. I think we don't even need the 'b' after the year for unique filenames; it was just meant to rank results by their time of publication (e.g. between a January and a December publication), but we probably don't need to be that precise. I agree about the "other factors". Inside the file, we can put them in "comment"?

For the in-file experiment names, we are not constrained to simple capitals, no? Can also write H.E.S.S. and/or \textit{Fermi}-LAT?

moritzhuetten commented 2 years ago

I propose the following nomenclature for targets:

Filename - source in meta data

gc - MW Centre
gc - MW Inner Halo
gc - MW Outer Halo
draco - Draco dSph
ursaminor - Ursa Minor dSph
sagittarius - Sagittarius dSph
canismajor - Canis Major dSph
willman1 - Willman I dSph
sculptor - Sculptor dSph
carina - Carina dSph
segue1 - Segue I dSph
booetes1 - Bo\"{o}tes I dSph
comaberenices - Coma Berenices dSph
fornaxdsph - Fornax dSph
ursamajor2 - Ursa Major II dSph
triangulum2 - Triangulum II dSph cand.
segue2 - Segue II dSph
canesvec1...2 - Canes Ven I - II dSph
hercules - Hercules dSph
sextans - Sextans dSph
draco2 - Draco II dSph
leo1...5 - Leo I - V dSph
reticulum2 - Reticulum II dSph
tucana2 - Tucana II dSph
tucana2...4 - Tucana III - IV dSph cand.
grus2 - Grus II dSph cand.
1fglJ23470710 - 1FGL J2347.3+0710 (and so on)
galplane - Galactic Plane Survey
m15 - M15
ngc6388 - NGC 6388
m33 - M33
m32 - M32
wlm - WLM
abell2029 - Abell 2029
perseuscluster - Perseus (Abell 426)
fornaxcluster - Fornax (Abell S0373)
comacluster - Coma (Abell 1656)
multidsph - N dSph galaxies

and so on.

moritzhuetten commented 2 years ago

Hi, sounds all good, template updated. I leave dmfraction for the time being. We should also add the observation time to the metadata, as we nicely collected it in our paper.

We did not consider decay so far. But I think we should also include it in our repo, no? I think the best approach would then be to extend the file naming to

instrument_year_target_process_channel_(text).ecsv

where process is either ann or dec. Inside the file, the units make it implicitly clear which process is considered.
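A minimal sketch of how the extended naming scheme could be assembled in code. The function name bound_filename is hypothetical, the parentheses around text are read here as "optional", and the instrument is lower-cased as suggested earlier in the thread.

```python
def bound_filename(instrument, year, target, process, channel, text=None):
    """Build 'instrument_year_target_process_channel_(text).ecsv'."""
    if process not in ("ann", "dec"):
        raise ValueError("process must be 'ann' (annihilation) or 'dec' (decay)")
    parts = [instrument.lower(), str(year), target, process, channel]
    if text:  # optional free-text suffix, e.g. for boosts or the Sommerfeld effect
        parts.append(text)
    return "_".join(parts) + ".ecsv"


print(bound_filename("HESS", 2016, "gc", "ann", "bb"))  # prints: hess_2016_gc_ann_bb.ecsv
```

Keeping the fields in a fixed order means a file name can be split on underscores again later, provided the short names themselves contain no underscore.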

micheledoro commented 2 years ago

Great! We forgot the channel! Let's start!

moritzhuetten commented 2 years ago

Ok. I just discovered some kind of bug in the ecsv metadata description!? Sometimes, reading

from astropy.io import ascii

# The ECSV header is YAML; astropy exposes its "meta" entries as a dict.
data = ascii.read("filename.ecsv", format="ecsv")
data.meta['keyword']

fails. It seems to depend on some "bad" characters inside the keyword values. For example, it did not like # - {channel: "\gamma\gamma"} (but # - {channel: "gammagamma"} was ok), or # - {confidence "Confidence level as fraction, i.e. 0.95" } (# - {channel: "Confidence level as fraction"} was ok). Very strange.
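One possible explanation, offered as an assumption rather than a confirmed diagnosis: the ECSV header is YAML, and inside a YAML double-quoted string a backslash starts an escape sequence. \g is not a defined escape, so "\gamma\gamma" would be rejected, whereas \b and \t are defined (backspace and tab), so "b\bar{b}" or "\tau" would load but with altered characters. The confidence line quoted above also has no colon after the key. The stdlib-only sketch below flags both patterns; lint_meta_line is a hypothetical helper, not part of astropy.

```python
import re

# Characters that may follow a backslash in a YAML double-quoted scalar.
YAML_ESCAPES = set('0abtnvfre "/\\N_LPxuU')


def lint_meta_line(line):
    """Return a list of warnings for one '# - {key: "value"}' header line."""
    warnings = []
    # Strip the leading '#' comment marker and the '-' list marker.
    body = line.strip().lstrip("#").strip().lstrip("-").strip()
    match = re.fullmatch(r"\{\s*(\w+)\s*(:?)\s*(.*?)\s*\}", body)
    if match is None:
        return ["not a single-entry flow mapping"]
    key, colon, value = match.groups()
    if not colon:
        warnings.append(f"missing ':' after key '{key}'")
    if value.startswith('"'):
        # Backslashes are only special inside double quotes.
        for esc in sorted(set(re.findall(r"\\(.)", value))):
            if esc in YAML_ESCAPES:
                warnings.append(f"'\\{esc}' is read as an escape sequence")
            else:
                warnings.append(f"'\\{esc}' is not a valid escape")
    return warnings


print(lint_meta_line(r'# - {channel: "\gamma\gamma"}'))
print(lint_meta_line('# - {confidence "Confidence level as fraction, i.e. 0.95" }'))
```

If this is indeed the cause, single-quoted values (where YAML keeps backslashes literally) or plain short names such as gammagamma would sidestep the problem.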

micheledoro commented 2 years ago

I propose to keep a file for the acronyms, I created three:

Let's modify and update the files directly when needed

micheledoro commented 2 years ago

I have generated here the file for the MAGIC bb decay DM bound of Ninci. In principle I can add the 68% and 95% bands, but I have two doubts:

do we want to add the 68% or the 95% CL?
how can you sample the bands at the same energy as the central value with plot digitizers?

moritzhuetten commented 2 years ago

Hi Michele, thanks!

The 68% and 95% CL bands are usually given only for the sensitivity; the actual limit is only a single curve, no? BTW, the dSph limits that VERITAS provides come with 68% uncertainty bands from the J-factor systematic uncertainty; I included them, see e.g. here.
Regarding the sampling at the same points: no, I think that is actually not directly possible, but resampling on the same grid after some interpolation/spline is no problem, as I mentioned already. That's how I did it already for redrawing data-thieved bands.
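The resampling step could look roughly like the sketch below. It uses piecewise-linear interpolation in log-log space, swapped in for the spline mentioned above to stay dependency-free; the function names and the sample numbers are invented for illustration, and the mass samples are assumed to be strictly increasing.

```python
import math


def loglog_interp(x, xs, ys):
    """Interpolate y(x) linearly in log-log space; xs must be increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the sampled mass range")
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = math.log(x / xs[i]) / math.log(xs[i + 1] / xs[i])
            return ys[i] * (ys[i + 1] / ys[i]) ** t


def common_grid(curves, n=5):
    """Log-spaced grid over the mass range shared by all curves."""
    lo = max(xs[0] for xs, _ in curves)
    hi = min(xs[-1] for xs, _ in curves)
    grid = [lo * (hi / lo) ** (i / (n - 1)) for i in range(n)]
    grid[-1] = hi  # guard against rounding past the last sample
    return grid


# Two hand-digitized curves sampled at different masses (made-up numbers).
curve_a = ([10.0, 100.0, 1000.0], [1e-24, 1e-25, 1e-26])
curve_b = ([50.0, 500.0, 5000.0], [3e-25, 8e-26, 5e-26])
grid = common_grid([curve_a, curve_b], n=3)
for xs, ys in (curve_a, curve_b):
    print([loglog_interp(m, xs, ys) for m in grid])
```

Restricting the grid to the overlapping mass range avoids extrapolating any curve beyond the points that were actually digitized.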

micheledoro commented 2 years ago

Also, if the arXiv ID is not available, should we leave it empty or provide alternative links? For example, proceedings are sometimes not posted on the arXiv.

micheledoro commented 2 years ago

I saw that in one ecsv file you write e.g.

# - {Source:           "Segue I dSph"}
# - {channel:          "b\bar{b}"}

However, I understood that it is here where we have to put the shortnames e.g.

# - {Source:           "segue1"}
# - {channel:          "bb"}

The reason is that when we use the metadata to, say, plot all bb limits or all segue1 limits, it is better to search for the short names than for the full names with spaces and LaTeX symbols. Let me know.

micheledoro commented 2 years ago

Also, there are at least two cases (MAGIC draco first paper and MAGIC willman1 first paper) in which we have limits for points; I have modified the channel list accordingly. I have also put the target list in order; can you please check?

moritzhuetten commented 2 years ago

Also, if the arXiv ID is not available, should we leave it empty or provide alternative links? For example, proceedings are sometimes not posted on the arXiv.

Hi, then just leave it blank? At least PoS has a DOI, no? Alternatively, we could change the arXiv or DOI keyword to a general URL keyword?

moritzhuetten commented 2 years ago

I saw that in one ecsv file you write e.g.
# - {Source:           "Segue I dSph"}
# - {channel:          "b\bar{b}"}
However, I understood that it is here where we have to put the shortnames e.g.
# - {Source:           "segue1"}
# - {channel:          "bb"}
The reason is that when we use the metadata to, say, plot all bb limits or all segue1 limits, it is better to search for the short names than for the full names with spaces and LaTeX symbols. Let me know.

Ah, I see your point! Indeed, it might be much smarter to use the short names for both channels and sources inside the files, and to map them to a human-readable/LaTeX format via the legend files. This also solves the issue that ecsv sometimes has problems with special characters. So let's change it (I can do it)!
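A small sketch of the idea: metadata holds only short names, and a lookup table supplies the display labels at plotting time. The dictionaries stand in for the legend files, whose actual format is not fixed in this thread; the entries are taken from the lists above, and the records are invented for illustration.

```python
# Stand-ins for the legend files: short name -> display label.
CHANNEL_LABELS = {
    "bb": r"b\bar{b}",
    "tautau": r"\tau^+\tau^-",
    "ee": r"e^+e^-",
    "tt": r"t\bar{t}",
    "WW": r"W^+W^-",
    "gg": r"\gamma\gamma",
    "Zg": r"Z\gamma",
}
SOURCE_LABELS = {"segue1": "Segue I dSph", "draco": "Draco dSph"}


def select(records, **criteria):
    """Keep the metadata records whose fields equal all given short names."""
    return [r for r in records
            if all(r.get(key) == value for key, value in criteria.items())]


def label(record):
    """Human-readable legend entry, with the channel in LaTeX math mode."""
    return f"{SOURCE_LABELS[record['source']]}, ${CHANNEL_LABELS[record['channel']]}$"


# Hypothetical metadata as it might be read from three bound files.
records = [
    {"source": "segue1", "channel": "bb"},
    {"source": "segue1", "channel": "tautau"},
    {"source": "draco", "channel": "bb"},
]
for rec in select(records, channel="bb"):
    print(label(rec))
```

Raw strings keep the LaTeX backslashes intact on the Python side, so the special characters never need to pass through the ecsv header.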

moritzhuetten commented 2 years ago

Also, there are at least two cases (MAGIC draco first paper and MAGIC willman1 first paper) in which we have limits for points; I have modified the channel list accordingly. I have also put the target list in order; can you please check?

Looks good, thanks! I will change the lists to be machine-readable, according to the above discussion, ok? (Maybe also ecsv?)

moritzhuetten / dmbounds

Template for bound #1