zooniverse / planet-four

Identify and measure features on the surface of Mars
https://www.planetfour.org/
Apache License 2.0

Uploading and processing new HiRISE images #166

Closed mschwamb closed 7 years ago

mschwamb commented 7 years ago

We're preparing for the next set of images we'd like to put on the site. @astopy we'd like the following uploaded and cut into tiles:

ESP_040246_0935 ESP_039969_0935 ESP_039824_0935 ESP_039547_0935 ESP_039468_0935 ESP_038822_0935 ESP_038625_0930 ESP_038492_0935 ESP_038215_0935 ESP_038149_0935 ESP_038110_0930 ESP_037964_0935 ESP_040311_0940 ESP_040193_0940 ESP_037977_0940 ESP_037976_0940

@astopy when you have time could you run the processing and upload scripts?

adammcmaster commented 7 years ago

@mschwamb It looks like the HiRISE site may have changed since the last time we ran this. Could you just confirm which image link we should be downloading from here: http://hirise.lpl.arizona.edu/ESP_040246_0935

At the moment the download script is getting the RGB colour non-map projected JPEG file (http://hirise-pds.lpl.arizona.edu/PDS/EXTRAS/RDR/ESP/ORB_040200_040299/ESP_040246_0935/ESP_040246_0935_RGB.NOMAP.browse.jpg) but it seems to be expecting to grab a j2k/jp2 file. Should it be downloading the last file in the "JP2 Extras" section (http://hirise-pds.lpl.arizona.edu/download/PDS/EXTRAS/RDR/ESP/ORB_040200_040299/ESP_040246_0935/ESP_040246_0935_RGB.NOMAP.JP2)?

mschwamb commented 7 years ago

@astopy my understanding is that the JP2 is the one processed because it is higher resolution than the browse.jpg. Yep, RGB.NOMAP.JP2 is what we want. Thanks for checking.
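
For reference, a minimal sketch (in Python, not the actual Ruby download script) of how the RGB.NOMAP.JP2 URL can be assembled from an observation ID, assuming the PDS directory layout visible in the links above, where orbit folders are grouped in blocks of 100:

```python
# Sketch only: derives the JP2 download URL from a HiRISE observation ID,
# assuming ORB_xxxxxx_xxxxxx folders group orbits in blocks of 100.
def jp2_url(obs_id):
    phase, orbit, target = obs_id.split("_")   # e.g. "ESP", "040246", "0935"
    lo = int(orbit) // 100 * 100               # floor the orbit to the block start
    orb_dir = f"ORB_{lo:06d}_{lo + 99:06d}"
    return ("http://hirise-pds.lpl.arizona.edu/download/PDS/EXTRAS/RDR/"
            f"{phase}/{orb_dir}/{obs_id}/{obs_id}_RGB.NOMAP.JP2")

print(jp2_url("ESP_040246_0935"))
# -> .../RDR/ESP/ORB_040200_040299/ESP_040246_0935/ESP_040246_0935_RGB.NOMAP.JP2
```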

adammcmaster commented 7 years ago

@mschwamb Thanks. I've run into another problem now -- the way the image metadata (i.e. acquisition date, coordinates, etc.) is marked up on the site has changed, so now it's all in one big table cell (instead of each item being in a separate cell). That's going to make it a lot harder to scrape the metadata. Before I start working on that, do we actually need all of that in Ouroboros? Or would it be good enough to just have the original name/download link to match them up later?

michaelaye commented 7 years ago

I actually have code to scrape metadata from PDS summary files, so if you give me the column names you need, I can provide that.

mschwamb commented 7 years ago

@michaelaye it might be better to provide @astopy with the code so we've got one pipeline, but that's up to @astopy. I think the P4 scripts are python if I recall correctly.

adammcmaster commented 7 years ago

Unfortunately the scripts are actually Ruby, but @michaelaye if you could share your code that'd be great and I'll find a way to adapt it.

michaelaye commented 7 years ago

Sorry, I didn't see your reply. I still need to know which metadata columns are required, because some of them have several versions flying around and I need to make sure that I point you to the correct index file. It's unfortunately not as simple as pointing to a metadata file for the RGB.NOMAP.JP2s, because those don't exist.

mschwamb commented 7 years ago

@michaelaye should be the following for the non-map image

"coords":[-85.759,106.051],"location":{"standard":"http://www.planetfour.org/subjects/standard/5501938669736d5fdd000000.jpg"},"metadata":{"acquisition_date":"2013-01-03T00:00:00Z","emission_angle":1.0,"lat_centered":-85.759,"lng_east":106.051,"local_mars_time":" 6:01 PM","name":"ESP_030184_0940","north_azimuth":130.0,"phase_angle":68.2,"sub_solar_azimuth": 37.5,"time":null}

adammcmaster commented 7 years ago

@michaelaye There's also a full list here: https://github.com/zooniverse/planet-four/blob/master/data-import/fetch_source_file_and_metadata.rb#L33

michaelaye commented 7 years ago

I put a gist here: https://gist.github.com/702fff89930de42822d26e14fb182160

In my planetpy tools (pip install planetpy) there's a module called 'pdstools'.

In there I wrote highly generalized PDS index readers that wrap the Python PVL module (PVL, the Parameter Value Language, is the syntax for Planetary Data System label files). Relevant functions are in https://github.com/michaelaye/planetpy/blob/master/planetpy/pdstools.py

But possibly it's more efficient for you to just parse the specific data files yourself. Download the cumulative RDR index and its label file here:

http://hirise-pds.lpl.arizona.edu/PDS/INDEX/RDRCUMINDEX.LBL
http://hirise-pds.lpl.arizona.edu/PDS/INDEX/RDRCUMINDEX.TAB

The label file has the column names for the .TAB file, but in PVL format. The .TAB file is a fixed-format text file. You can read the column names and the column specification (i.e. start and end byte for each column) out of the label, in case you can feed them to a Ruby text parser.
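
A minimal sketch (in Python, using the pvl package mentioned above) of that approach: read the COLUMN specs from the .LBL, then slice each fixed-width line of the .TAB. The table object name ("RDR_INDEX_TABLE") and the column names used in the filter are assumptions to verify against the actual label:

```python
import pvl  # parses PVL/PDS3 labels

label = pvl.load("RDRCUMINDEX.LBL")
table = label["RDR_INDEX_TABLE"]    # assumed object name; check the label

# Collect (name, zero-based start, width) for every COLUMN object.
# START_BYTE in PDS labels is 1-based, hence the -1.
columns = [(col["NAME"], col["START_BYTE"] - 1, col["BYTES"])
           for key, col in table.items() if key == "COLUMN"]

with open("RDRCUMINDEX.TAB") as tab:
    for line in tab:
        row = {name: line[start:start + width].strip().strip('"')
               for name, start, width in columns}
        # "PRODUCT_ID" and the two center-coordinate columns are assumed names.
        if row.get("PRODUCT_ID", "").startswith("ESP_040246_0935"):
            print(row["IMAGE_CENTER_LATITUDE"], row["IMAGE_CENTER_LONGITUDE"])
```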

If this is all too much, we can also agree on a format and I can do the parsing for you and provide the results. Meg, I don't remember why we parse so many of those metadata items if we don't include them in the classification data file. Where are they being used?

mschwamb commented 7 years ago

@michaelaye that info is in the daily mongo database dumps we get sent. We don't normally get it in CSV since it changes only a few times a year, and adding the info to each classification would make the database bigger. The choice of columns was made before I formally joined the science team. Given how close we are on the first paper, I think it would be better if you spend your time on that and @astopy tries to adapt your code. Do you agree?

michaelaye commented 7 years ago

Well, 95% of the work was already done last night; I just had to update my tools to meet standards, so it would have been stupid not to finish it, in case @astopy decides that, to be efficient, he'll just take my CSV output. I updated the gist and added a CSV file with the metadata for this data run; it should all be there.
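
If it helps on the import side, a hypothetical sketch of consuming such a CSV; the filename and column names below are illustrative assumptions, not the gist's actual schema:

```python
import csv

# Assumed filename and column names -- adjust to the actual CSV in the gist.
with open("p4_metadata.csv", newline="") as f:
    by_obs = {row["OBSERVATION_ID"]: row for row in csv.DictReader(f)}

meta = by_obs["ESP_040246_0935"]
print(meta.get("IMAGE_CENTER_LATITUDE"))
```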

adammcmaster commented 7 years ago

Great, thanks @michaelaye. Unfortunately I'm pretty much out of time to get this imported into Ouroboros before our Christmas break here, as today's our last working day and I've got some data already being imported for another project that won't finish in time.

I'll be able to finish this up in the first week of January, so hopefully the current data will last until then.

mschwamb commented 7 years ago

@astopy - thanks. We're good on the live data front. Happy Holidays! Enjoy your much-deserved break!

adammcmaster commented 7 years ago

@michaelaye Thanks again. I've integrated your script into the data import tools here: https://github.com/zooniverse/planet-four/blob/master/data-import/fetch_metadata.py

@mschwamb New data should be ready either today or Monday.

adammcmaster commented 7 years ago

@mschwamb The data's in now!