capstone-coal / coal-sds

An Apache OODT-powered Science Data System for COAL
Apache License 2.0

Implement OODT Metadata Extractors for pycoal data products #6

Open lewismc opened 6 years ago

lewismc commented 6 years ago

Now that we have basic file management set up, we should look at implementing OODT Metadata Extractors for all products consumed by pycoal (e.g. AVIRIS-C/NG imagery and accompanying .hdr files, the spectral libraries and hydrology datasets) as well as those generated by pycoal (e.g. mineral, mining and environment classifications). This will produce a much richer metadata model, enabling us to improve the cataloguing functionality.

@bdegley4789 can you please list all of the resources pycoal consumes and products it generates?

@thomkenn can you please start working on this when you get a chance?

bdegley4789 commented 6 years ago

Sorry for the late reply. I missed the email for this notification.

bdegley4789 commented 6 years ago

Pycoal Consumes

Input header file = ang20150420t182050_corr_v1e_img.hdr
Input image = ang20150420t182050_corr_v1e_img
Can be found here: ftp://avng.jpl.nasa.gov/AVNG_2015_data_distribution/L2/ang20150420t182050_rfl_v1e/

Spectral library header file = s06av95a_envi.hdr
Can be found here: ftp://ftpext.cr.usgs.gov/pub/cr/co/denver/speclab/pub/spectral.library/splib06.library/Convolved.libraries/s06av95a_envi.hdr

Vector file = Shape/NHDFlowline.shp
Can be found here: ftp://rockyftp.cr.usgs.gov/vdelivery/Datasets/Staged/Hydrography/NHD/State/HighResolution/Shape/NHD_H_New_Mexico_Shape.zip
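Since several of the inputs above are ENVI .hdr files, a metadata extractor will likely need to read their key/value fields. The following is only a minimal sketch, assuming the usual plain-text ENVI header layout (a first line reading `ENVI`, then `name = value` pairs, with list values wrapped in braces and possibly spanning several lines). The function names `parse_envi_header` and `_clean` are hypothetical and are not part of pycoal or OODT.

```python
def _clean(value):
    """Turn a brace-wrapped value into a list of strings; leave others as-is."""
    if value.startswith("{") and value.endswith("}"):
        items = [item.strip() for item in value[1:-1].split(",")]
        return [item for item in items if item]
    return value


def parse_envi_header(text):
    """Parse the text of an ENVI header into a dict (hypothetical helper)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "ENVI":
        raise ValueError("not an ENVI header: first line must be 'ENVI'")

    metadata = {}
    open_key = None   # name of a brace-wrapped field still being read
    buffer = []       # pieces of that multi-line field

    for line in lines[1:]:
        if open_key is None:
            if "=" not in line:
                continue  # skip blank or unrecognised lines
            name, value = line.split("=", 1)
            name = name.strip().lower()
            value = value.strip()
            if value.startswith("{") and not value.endswith("}"):
                # The braces close on a later line; keep collecting.
                open_key = name
                buffer = [value]
            else:
                metadata[name] = _clean(value)
        else:
            buffer.append(line.strip())
            if line.strip().endswith("}"):
                metadata[open_key] = _clean(" ".join(buffer))
                open_key = None

    return metadata
```

All values are kept as strings; converting fields such as `samples` or `bands` to integers is left to whatever consumes the dictionary. Free-text fields that contain commas inside braces would also be split into a list by this sketch.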

Pycoal Generates

RGB file = ang20150420t182050_corr_v1e_img_rgb.hdr & ang20150420t182050_corr_v1e_img_rgb.img

Classified file = ang20150420t182050_corr_v1e_img_class.hdr & ang20150420t182050_corr_v1e_img_class.img

Mining file = ang20150420t182050_corr_v1e_img_class_mining.hdr & ang20150420t182050_corr_v1e_img_class_mining.img

Environmental correlation files = ang20150420t182050_corr_v1e_img_class_mining_NHDFlowline_correlation.hdr & ang20150420t182050_corr_v1e_img_class_mining_NHDFlowline_correlation.img

ang20150420t182050_corr_v1e_img_class_mining_NHDFlowline_proximity.hdr & ang20150420t182050_corr_v1e_img_class_mining_NHDFlowline_proximity.img

All of these generated files are currently staged here: https://drive.google.com/drive/folders/1YVhdLxvrZE3eC97OEXathLMJRgWt8haO
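The file names listed above already carry some metadata that an extractor could pick up. As a rough sketch only, the snippet below derives a product type from the suffixes seen in this list and reads the acquisition time from the `angYYYYMMDDtHHMMSS` prefix. The type labels, the `describe_product` function and the assumption that the prefix encodes the acquisition date and time are illustrative choices, not something pycoal or OODT defines.

```python
import os
import re
from datetime import datetime

# Suffixes taken from the file names in this comment; the type labels on the
# right are assumptions made for illustration. Longer suffixes are checked
# first so "_class_mining" is not mistaken for "_class".
SUFFIX_TO_TYPE = [
    ("_correlation", "correlation"),
    ("_proximity", "proximity"),
    ("_class_mining", "mining"),
    ("_class", "classified"),
    ("_rgb", "rgb"),
]

# Assumed flight-line prefix: "ang" + YYYYMMDD + "t" + HHMMSS.
FLIGHT_PATTERN = re.compile(r"^(ang\d{8}t\d{6})")


def describe_product(filename):
    """Return basic metadata guessed from a pycoal file name (hypothetical)."""
    stem, ext = os.path.splitext(os.path.basename(filename))
    if ext not in (".hdr", ".img"):
        stem = os.path.basename(filename)  # input images have no extension

    match = FLIGHT_PATTERN.match(stem)
    if match is None:
        raise ValueError("unrecognised file name: %s" % filename)
    flight_id = match.group(1)
    acquired = datetime.strptime(flight_id[3:], "%Y%m%dt%H%M%S")

    product_type = "input"
    for suffix, name in SUFFIX_TO_TYPE:
        if stem.endswith(suffix):
            product_type = name
            break

    return {
        "flight_id": flight_id,
        "acquired": acquired.isoformat(),
        "product_type": product_type,
    }
```

For example, `describe_product("ang20150420t182050_corr_v1e_img_class_mining.hdr")` would report the product type `mining` for flight line `ang20150420t182050`.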

lewismc commented 6 years ago

Hi @thomkenn, did @bdegley4789 pass on the instruction to begin working on this? I want to make sure we are making progress on COAL-SDS; this means some tangible progress from week to week. Each issue left in this repository should take no longer than 1 week (12 working hours) to complete, as all of the relevant documentation and resources (e.g. community mailing lists) are available. Please keep me in the loop with progress, thank you.

thomkenn commented 6 years ago

Yes, he has brought me up to speed, and we are fully committed to making real week-to-week progress!

lewismc commented 6 years ago

Okey doke, if you could try to get a pull request completed for this coming week it would be great. Also, please make sure to augment the meeting notes ahead of time with any progress. Thank you, I appreciate it.

lewismc commented 6 years ago

@thomkenn I should also say, if there is anything you are stuck on, PLEASE let me know. I can put aside time for an hour or so to resolve any issues. The idea is for the work to go on between meetings such that we have a quick tag-up on Thursdays. Please keep me up to speed with what is going on so that I can help keep things on track. Thank you.

lewismc commented 6 years ago

How are things coming along, @thomkenn?

lewismc commented 6 years ago

PING @thomkenn

thomkenn commented 6 years ago

Gah, sorry, didn't see your ping; these keep going to my spam folder for some reason. I have a possible commit I want to discuss in the meeting, as well as an error I saw while testing.

lewismc commented 6 years ago

Thanks, please push your proposed commit to your remote repository on a feature branch named ISSUE-6. We will discuss it when we catch up today. Thanks.

On Thu, Mar 15, 2018 at 06:37 thomkenn <notifications@github.com> wrote:

gah, sorry, didnt see your pin, these keep going to my spam folder for some reason. i have a possible commit i want to discuss in the meeting, as well as an error i saw while testing.

--

Lewis Dr. Lewis J. McGibbney Ph.D, B.Sc Skype: lewis.john.mcgibbney

lewismc commented 6 years ago

@thomkenn don't work on this, I will do it. Skip over to https://github.com/capstone-coal/pycoal/issues/105 and work with @bdegley4789, thanks.

lewismc commented 5 years ago

We need to add the following commands to the crawler_launcher tool execution. This is documented at https://cwiki.apache.org/confluence/display/OODT/Using%2BTikaCmdLineMetExtractor