cernopendata / opendata.cern.ch

Source code for the CERN Open Data portal
http://opendata.cern.ch/
GNU General Public License v2.0

Change text and picture resources/vispa #327

Closed robfisc closed 10 years ago

robfisc commented 10 years ago

Dear all,

the current text describing VISPA does not really show what visitors could do with it. And the screenshot is from a deprecated desktop version of the software, not the linked web version.

Could you please update the text under education/resources: vispa (2nd entry)?

Thanks, Robert

Link to new screenshot https://vispa.physik.rwth-aachen.de/pydio/data/public/c22202.php

title Online Analysis of CMS Data with VISPA

body With the VISPA internet platform you can perform physics analyses with CMS public data in the web browser. Begin with the discovery of a boson in an example analysis. Then, you can develop your own ideas and visualize the scientific results.

VISPA is developed at the RWTH Aachen University in Germany and is used for teaching data analysis, e.g. in courses on particle and astroparticle physics for third-year undergraduate physics students.

link to https://vispa.physik.rwth-aachen.de/CERN Start your analysis online

pamfilos commented 10 years ago

@robfisc: Thanks a lot. Will update.

RaoOfPhysics commented 10 years ago

Thanks for that, @robfisc!

robfisc commented 10 years ago

@pamfilos thanks for the update.

@RaoOfPhysics thanks to you, too, for the initial version.

There is another page, which still contains the old version of the text: http://opendata.cern.ch/collection/CMS-External-Resources

You can reach it if you click on "CMS External Resources" on http://opendata.cern.ch/education/CMS

pamfilos commented 10 years ago

@robfisc @RaoOfPhysics: Changes for this page (http://opendata.cern.ch/collection/CMS-External-Resources) were also made, but a DB update is needed before they show up on the website. Will do later today.

robfisc commented 10 years ago

@pamfilos : perfect. Thank you.

RaoOfPhysics commented 10 years ago

@pamfilos: Sorry, I should've caught this earlier. Can you please change "visualize" to "visualise", as discussed in #160?

pamfilos commented 10 years ago

@RaoOfPhysics: OK. Can you also check whether any changes are needed for "analysis", "analyses", etc. on this page before I update?

robfisc commented 10 years ago

@pamfilos I would say that in the first sentence it should be "analyses", not the singular.

RaoOfPhysics commented 10 years ago

@pamfilos: Seconding @robfisc's point. That's the only change I think should be made.

robfisc commented 10 years ago

@pamfilos sorry for not noticing this earlier: on the second page pointing to VISPA (http://opendata.cern.ch/collection/CMS-External-Resources), there is a description of the provenance of the dataset used, which seems to be copied from the CMS HEP tutorial ("This tutorial is using a small..."). This text does not fit our project for a couple of reasons.

We therefore believe it is best to describe the provenance on our site, as this entry point is too general.

Could you remove this paragraph?

Thanks, Robert

We will add something like this directly to the CMS examples, which we promote:


This example uses a small fraction (50 pb⁻¹) of real CMS data taken in 2011, stored in the ig format, to facilitate simplified analysis (the CMS collaboration board has agreed that this particular dataset can be released for educational purposes). Data source: http://cms.web.cern.ch/content/cms-public-data-samples

katilp commented 10 years ago

@robfisc we are planning to have a metadata field for the vispa record (and other "external" records) indicating the input data. Are the files you use already on the portal under http://opendata.cern.ch/collection/CMS-Derived-Datasets ? If yes, we can point to that directly.

katilp commented 10 years ago

I see that they are not yet on the portal. @tpmccauley are these events on docdb? It seems that for this sample the CMS public data page http://cms.web.cern.ch/content/cms-public-data-samples points directly to the tutorial, whereas it points to docdb for other samples. We should have them on the portal (otherwise we are just pointing from one page to another for the original files without providing them anywhere).

robfisc commented 10 years ago

@katilp currently, we use the following samples from http://cms.web.cern.ch/content/cms-public-data-samples

I guess the latter is the same as listed on the portal on the link you posted.

Our thinking was somewhat different, though. Since we provide a platform that can in principle be used to analyse any data, it does not make sense to assign such metadata to the link to our platform. It is not a link to a data resource, but to an analysis resource. I will mention one example that is already planned, i.e. known to happen. As soon as I find the time, I will rerun our filler directly on the 2010B AODs to create data in the pxlio format for our server. As discussed previously, this will enable us to include information that is present neither in the CSV nor in the ig files. My favourite example is the b-tagging discriminators. In that case, the metadata statement on the portal would get awkwardly complicated, and one would always need to remember to keep it up to date. More important than this foreseeable addition are the changes that are not yet known or planned.

For these reasons, our idea was to put the statement directly on our site where the data is used. It could look like the snippet I added to my previous post.

What do you think?

katilp commented 10 years ago

@robfisc OK, I see. Indeed, it then does not make sense to point to any particular record. It would be useful to briefly explain how you reprocess the data in the input metadata field. Could you provide such generic information (i.e. which formats of CMS public data you take and which tools you use to reformat them)?

robfisc commented 10 years ago

@katilp is this something you had in mind?


VISPA uses the IO format of the PXL library, which is a class collection for advanced level analysis in high-energy physics experiments. Public CMS data in CSV, ig, or AOD formats are converted to PXL class objects directly (CSV), or using the igfiles (ig) and the CMSSW (AOD) libraries.

Links: the first "PXL" links to https://vispa.physik.rwth-aachen.de/pxl ; "igfiles" links to https://github.com/tpmccauley/igfiles ; "CMSSW" links to https://github.com/cms-sw/cmssw
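
As a rough illustration of the three conversion routes named in the proposed text (CSV directly, ig via igfiles, AOD via CMSSW), the mapping could be sketched as below. This is only a sketch under stated assumptions: `CONVERSION_TOOL`, `EventStandIn`, `tool_for`, `read_csv_events`, and the column names `E1` and `px1` are hypothetical and are not part of VISPA, PXL, igfiles, or CMSSW. The stand-in class merely marks where a PXL object would be filled.

```python
import csv
import io
from dataclasses import dataclass, field

# Conversion route per input format, as listed in the text above.
# This dictionary is an illustrative structure, not part of VISPA or PXL.
CONVERSION_TOOL = {
    "csv": None,       # converted directly, no helper library
    "ig": "igfiles",   # read via the igfiles library
    "aod": "CMSSW",    # read via the CMSSW libraries
}


@dataclass
class EventStandIn:
    """Hypothetical placeholder for a PXL event object (NOT the PXL API)."""
    values: dict = field(default_factory=dict)


def tool_for(fmt):
    """Return the helper library named for a format, or None for direct conversion."""
    key = fmt.lower()
    if key not in CONVERSION_TOOL:
        raise ValueError(f"unsupported input format: {fmt}")
    return CONVERSION_TOOL[key]


def read_csv_events(text):
    """Turn each CSV row into one stand-in event with float-valued columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [EventStandIn({k: float(v) for k, v in row.items()}) for row in reader]


# Made-up two-event sample; the column names are assumptions for illustration.
sample = "E1,px1\n10.5,-3.2\n8.0,1.1\n"
events = read_csv_events(sample)
print(len(events), events[0].values["E1"], tool_for("ig"))
```

Keeping the format-to-tool mapping in one table means a new input format only needs one extra entry, which fits the point made earlier in the thread that the set of inputs is expected to change.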