NLeSC / TEAM2018

This is the repo for the 2018 TEAM sprint

EOSCPfL #22

Open HannoSpreeuw opened 6 years ago

HannoSpreeuw commented 6 years ago

The outcome of the EOSC Pilot for LOFAR project could also lead to a paper (and a demo).

The goal of this project is to unlock the LOFAR Long Term Archive (LTA). It contains more than 28 PB of LOFAR observations stored as visibility datasets, which have produced almost no scientific output so far. Almost all astronomical science starts with sky images, so these datasets have to be calibrated and imaged. That is very labour-intensive: many processing steps are needed to turn an uncalibrated visibility dataset into a sky image that can be used for publication. This is a main reason why the LTA is hardly used, and that is a waste of taxpayers' money. We want to bridge this gap by automating the processing and taking care of 70% of the astronomer's work. By selecting an observation from a web portal and starting the processing with just a few mouse clicks, the astronomer should get a reasonably good sky image. That should be enough to decide whether the observation contains interesting science. If it does, he or she can fine-tune the processing to make a close to perfect sky image.

In the last sprint, it was shown that we could select observations directly from the archive and start processing them to coarsely calibrated compressed datasets with just a few mouse clicks.

However, it turned out that this did not include "staging", i.e. copying the observation from the LTA tapes to disk. It also did not include imaging the coarsely calibrated compressed datasets. These steps have to be added.

We want to add these steps, show that we can bridge the gap and unlock the LTA. This would make the LTA a much more attractive astronomical resource.

romulogoncalves commented 6 years ago

@HannoSpreeuw is this proposal still up for the TEAM2018 November sprint?

HannoSpreeuw commented 6 years ago

Yes. Very much so. But there also remains some software work to be done, which I hope to complete before the sprint.

HannoSpreeuw commented 5 years ago

But there may be an issue. The software is very immature and was mainly developed to explore new technologies. We cannot show any impact of the software on potential users yet. That is, did these applications really unlock the LOFAR LTA? It is way too early to tell.

romulogoncalves commented 5 years ago

@jmaassen it seems we could overlap with Progress, could you elaborate a little bit more on your idea?

@HannoSpreeuw it sounds like we are going to do software development, is that right?

HannoSpreeuw commented 5 years ago

What is Progress? Do you mean PROCESS?

HannoSpreeuw commented 5 years ago

@romulogoncalves yes, it looks like we're going to do software development.

romulogoncalves commented 5 years ago

PROCESS is the name of the project.

romulogoncalves commented 5 years ago

@HannoSpreeuw are there project hours for this work? If yes, on which project?

HannoSpreeuw commented 5 years ago

Not on EOSCPfL any more. But on PROCESS, hopefully. Please ask @jmaassen.

jmaassen commented 5 years ago

Writing the hours on PROCESS is fine, provided we can also use it as a demo for this project.

romulogoncalves commented 5 years ago

@HannoSpreeuw were all the goals achieved? Could you summarize in 3 lines what was achieved? Can we consider it done and close this issue?

HannoSpreeuw commented 5 years ago

The Dockerfile for the container that deploys the web portal was almost completed by Adithya. There is just one bug I need to fix.

It does not include staging yet, but that will be worked on within PROCESS. This is a major task.

Progress has been made with respect to a demo: the reduction of an observation of at least 100 GB. I believe that all the issues have been solved here. I am waiting for my contact person at ASTRON to provide me with a suitable observation of a calibrator. I need at least 20 subbands, but preferably not a lot more than 100 GB.

We did not have time to work on the paper. Our goal is to submit a paper before spring.

romulogoncalves commented 5 years ago

@HannoSpreeuw what is the current status?

HannoSpreeuw commented 5 years ago

The Docker build system for the container that can deploy a web service for one-click processing of observations from the LOFAR LTA has been completed: https://github.com/process-project/lofar-lta-one-click-processing

@rvanharen and I want to write a software paper about this before spring. It will be about unlocking the LOFAR LTA and bridging the gap between data and astronomers by means of one-click processing. Perhaps we should get more people involved, but we haven't asked anyone else yet.

Also, the demo has been performed. It is actually a working example with less than 100 GB of data, but anyone can now calibrate a small dataset from a Singularity container.

romulogoncalves commented 5 years ago

@HannoSpreeuw I think the major goals have been achieved. For the paper a new issue should be opened, and then managed within the project, i.e., it is out of the scope of the TEAM sprint.

What do you think? If you agree, I will close this issue.

HannoSpreeuw commented 5 years ago

No, I don't agree. Writing a paper was explicitly mentioned as a goal when I created this issue. So that is not a new goal.

romulogoncalves commented 5 years ago

Good point, then it remains open.