USF-IMARS / wv-land-cover

🌎 Processing scripts for decision-tree land use classification on WorldView-2 imagery

upload Rookery + jobos Rrs files #47

Open 7yl4r opened 1 year ago

7yl4r commented 1 year ago

File lists:

\home1\datashare\regions\jobos\Processed\JobosFinal\FinalJobosForGEE\

\home1\datashare\regions\rookery\Processed\wv_atm_corrected\

7yl4r commented 1 year ago

Rookery

[x] create rookery-wv-rrs google cloud bucket.

Location type set to single region (US-east). Storage class set to Coldline ($0.004 per GB-month).

[x] Copy files to cloud bucket

(base) tylar@manglilloo:/srv/imars-objects/rookery/Processed/RookeryFinal/ReprocessedWithDEM/FirstClassificVersion$ cat ~/RookeryFilelistRrs.txt | gsutil -m cp -I gs://rookery-wv-rrs
Copying file://20100301T162229_01_P009_WV02_Rrs_Rookery-wDEM_v3_DEM.tif [Content-Type=image/tiff]...
Copying file://20100301T162230_01_P010_WV02_Rrs_Rookery-wDEM_v3_DEM.tif [Content-Type=image/tiff]...
==> NOTE: You are uploading one or more large file(s), which would run          
significantly faster if you enable parallel composite uploads. This
feature can be enabled by editing the
"parallel_composite_upload_threshold" value in your .boto
configuration file. However, note that if you do this large files will
be uploaded as `composite objects
<https://cloud.google.com/storage/docs/composite-objects>`_,which
means that any user who downloads such objects will need to have a
compiled crcmod installed (see "gsutil help crcmod"). This is because
without a compiled crcmod, computing checksums on composite objects is
so slow that gsutil disables downloads of composite objects.

Copying file://20100323T162209_01_P004_WV02_Rrs_Rookery-wDEM_v3_DEM.tif [Content-Type=image/tiff]...
Copying file://20100323T162209_01_P004_WV02_Rrs_Rookery-wDEM_v3_DEM.tif [Content-Type=image/tiff]...
Copying file://20100323T162210_01_P005_WV02_Rrs_Rookery-wDEM_v3_DEM.tif [Content-Type=image/tiff]...
[...]

[x] cp bucket files to gee

tasks started.

(base) tylar@manglilloo:~/wv-land-cover$ bash gee-uploads/gbucket_to_gee_w_metadata_rookery_rrs.sh rookery-wv-rrs /srv/imars-objects/rookery/Processed/wv_ortho_xml/ users/tylarmurray/nerrs_rookery_rrs_v01 | tee rookery_rrs_upload-2023_08_14.log
checking if the collection users/tylarmurray/nerrs_rookery_rrs_v01 exists...
collection created.
[...]
*** Transfering file  20200929T162715_03_P007_WV03_Rrs_Rookery-wDEM_v3_DEM ***
*** parsing metadata...
{"dt_Y": "2020", "dt_m": "09", "dt_d": "29", "dt_H": "16", "dt_M": "27", "dt_S": "15", "number": "03", "pass_n": "007", "sat_n": "03", "adjustments_version": "v3", "dt_a": "Tue", "dt_A": "Tuesday", "dt_w": "2", "dt_dd": "29", "dt_b": "Sep", "dt_B": "September", "dt_mm": "9", "dt_y": "20", "dt_HH": "16", "dt_I": "04", "dt_II": "4", "dt_p": "PM", "dt_MM": "27", "dt_SS": "15", "dt_f": "000000", "dt_z": "", "dt_Z": "", "dt_j": "273", "dt_jj": "273", "dt_U": "39", "dt_W": "39", "dt_c": "Tue Sep 29 16:27:15 2020", "dt_x": "09/29/20", "dt_X": "16:27:15"}
*** estimating xml filename...
xml fname is like: 20SEP29162715-M1BS-*_03_P007.XML
*** searching for xml file...
found file: /srv/imars-objects/rookery/Processed/wv_ortho_xml/20SEP29162715-M1BS-504649660010_03_P007.XML
*** extracting properties from .xml...
 -p IMD_NUMROWS=9911  -p IMD_NUMCOLUMNS=10651  -p ABSCALFACTOR_BAND_C=0.01397474  -p ABSCALFACTOR_BAND_B=0.01772364  -p ABSCALFACTOR_BAND_G=0.01316364  -p ABSCALFACTOR_BAND_Y=0.00672  -p ABSCALFACTOR_BAND_R=0.01020364  -p ABSCALFACTOR_BAND_RE=0.00606316  -p ABSCALFACTOR_BAND_N=0.01170909  -p ABSCALFACTOR_BAND_N2=0.01034947  -p FIRSTLINETIME=2020-09-29_16:27:15.768850  -p MEANSUNEL=59.1  -p MEANSUNAZ=155.4  -p MEANSATEL=51.7  -p MEANSATAZ=236.3  -p MEANOFFNADIRVIEWANGLE=34.3  -p CLOUDCOVER=0.452  -p MEANINTRACKVIEWANGLE=-22.8  -p MEANCROSSTRACKVIEWANGLE=-26.4  -p SATID=WV03  -p MODE=FullSwath  -p SCANDIRECTION=Forward  -p FILENAME=20SEP29162715-M1BS-504649660010_03_P007.NTF 
*** formatting ts for gee...
2020-09-29T16:27:15
*** transferring image and metadata...
Started upload task with ID: LJAH72EFYXXCKJFKOZR6LVCR
done!

All 190 tasks are done in the GEE task manager.
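The filename-to-metadata steps in the upload log above can be sketched in Python. This is only an illustration of what the log output implies, not the actual code in the upload script: the regular expression, the function name `describe_rrs_name`, and the returned keys are assumptions inferred from the two example filenames shown in this issue.

```python
import re
from datetime import datetime

# Month abbreviations are spelled out so the result does not depend on the locale.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")

# Assumed layout, inferred from the log:
# <YYYYmmddTHHMMSS>_<number>_P<pass>_WV<sat>_Rrs_<region>-wDEM_<version>_DEM
RRS_NAME = re.compile(
    r"^(?P<ts>\d{8}T\d{6})_(?P<number>\d{2})_P(?P<pass_n>\d{3})"
    r"_WV(?P<sat_n>\d{2})_Rrs_(?P<region>[A-Za-z]+)-wDEM_(?P<version>v\d+)_DEM$"
)


def describe_rrs_name(stem):
    """Return the XML glob pattern and GEE timestamp for an Rrs filename stem."""
    match = RRS_NAME.match(stem)
    if match is None:
        raise ValueError(f"unexpected Rrs filename: {stem}")
    when = datetime.strptime(match["ts"], "%Y%m%dT%H%M%S")
    # The XML name starts with e.g. 20SEP29162715 (two-digit year, month, day, time).
    prefix = f"{when:%y}{MONTHS[when.month - 1]}{when:%d%H%M%S}"
    return {
        "xml_glob": f"{prefix}-M1BS-*_{match['number']}_P{match['pass_n']}.XML",
        "gee_time": when.isoformat(),
        "sat_n": match["sat_n"],
    }
```

Calling `describe_rrs_name("20200929T162715_03_P007_WV03_Rrs_Rookery-wDEM_v3_DEM")` reproduces the `xml fname is like` pattern and the timestamp printed in the log.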

[x] set ImageCollection asset permissions

added users:

7yl4r commented 1 year ago

Jobos

[x] create jobos-wv-rrs google cloud bucket.

Location type set to single region (US-east). Storage class set to Coldline ($0.004 per GB-month).

[x] Copy files to cloud bucket

tylar@manglilloo:/srv/imars-objects/jobos/Processed/JobosFinal/ReprocessedWithDEM/FirstClassificVersion$ cat ~/JobosFilelistRrs.txt | gsutil -m cp -n -I gs://jobos-wv-rrs | tee ~/jobos-rrs-upload.log
Skipping existing item: gs://jobos-wv-rrs/20100201T150638_01_P006_WV02_Rrs_Jobos-wDEM_v3_DEM.tif
[...]
Skipping existing item: gs://jobos-wv-rrs/20171024T151512_01_P006_WV02_Rrs_Jobos-wDEM_v3_DEM.tif

[ ] cp bucket files to gee

bash gee-uploads/gbucket_to_gee_w_metadata_jobos_rrs.sh jobos-wv-rrs /srv/imars-objects/jobos/Processed/wv_ortho_xml/ users/tylarmurray/nerrs_jobos_rrs_v01 | tee jobos_rrs_upload-2023_09_02.log
[...]
Transfering file  20210220T145743_01_P006_WV02_Rrs_Jobos-wDEM_v3_DEM ***
*** parsing metadata...
{"dt_Y": "2021", "dt_m": "02", "dt_d": "20", "dt_H": "14", "dt_M": "57", "dt_S": "43", "number": "01", "pass_n": "006", "sat_n": "02", "adjustments_version": "v3", "dt_a": "Sat", "dt_A": "Saturday", "dt_w": "6", "dt_dd": "20", "dt_b": "Feb", "dt_B": "February", "dt_mm": "2", "dt_y": "21", "dt_HH": "14", "dt_I": "02", "dt_II": "2", "dt_p": "PM", "dt_MM": "57", "dt_SS": "43", "dt_f": "000000", "dt_z": "", "dt_Z": "", "dt_j": "051", "dt_jj": "51", "dt_U": "07", "dt_W": "07", "dt_c": "Sat Feb 20 14:57:43 2021", "dt_x": "02/20/21", "dt_X": "14:57:43"}
*** estimating xml filename...
xml fname is like: 21FEB20145743-M1BS-*_01_P006.XML
*** searching for xml file...
found file: /srv/imars-objects/jobos/Processed/wv_ortho_xml/21FEB20145743-M1BS-505417666010_01_P006.XML
*** extracting properties from .xml...
 -p IMD_NUMROWS=7168  -p IMD_NUMCOLUMNS=9216  -p ABSCALFACTOR_BAND_C=0.00909474  -p ABSCALFACTOR_BAND_B=0.01257455  -p ABSCALFACTOR_BAND_G=0.00963636  -p ABSCALFACTOR_BAND_Y=0.00501895  -p ABSCALFACTOR_BAND_R=0.01098462  -p ABSCALFACTOR_BAND_RE=0.00447579  -p ABSCALFACTOR_BAND_N=0.01217436  -p ABSCALFACTOR_BAND_N2=0.00888421  -p FIRSTLINETIME=2021-02-20_14:57:43.821850  -p MEANSUNEL=52.1  -p MEANSUNAZ=136.9  -p MEANSATEL=57.8  -p MEANSATAZ=167.3  -p MEANOFFNADIRVIEWANGLE=28.3  -p CLOUDCOVER=0.264  -p MEANINTRACKVIEWANGLE=-26.2  -p MEANCROSSTRACKVIEWANGLE=11.0  -p SATID=WV02  -p MODE=FullSwath  -p SCANDIRECTION=Forward  -p FILENAME=21FEB20145743-M1BS-505417666010_01_P006.NTF 
*** formatting ts for gee...
2021-02-20T14:57:43
*** transferring image and metadata...
Started upload task with ID: AS342KDZIB73Z4355CUEN42U
done!

All jobs submitted; awaiting tasks to finish in GEE task manager.

[x] set ImageCollection asset permissions

added users:

7yl4r commented 1 year ago

It appears the command terminated when I disconnected from the ssh session. Running the command again, I get a lot of "resuming upload" messages.

Only one image had uploaded before the disconnect. I am now rerunning the command inside a virtual screen session and piping the output through | tee rookery-rrs-upload.log, in the hope that it will survive a disconnect this time.
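For reference, an alternative to a screen session is to start the long-running command in its own session with its output redirected to a log file, so that it should not be tied to the ssh terminal. The sketch below assumes a POSIX system; `launch_detached` is a hypothetical helper and is not part of this repo.

```python
import subprocess


def launch_detached(cmd, log_path, stdin_path=None):
    """Start cmd in its own session, appending stdout and stderr to log_path."""
    # Feed stdin from a file if one is given, otherwise from /dev/null.
    stdin = open(stdin_path, "rb") if stdin_path else subprocess.DEVNULL
    try:
        with open(log_path, "ab") as log:
            return subprocess.Popen(
                cmd,
                stdin=stdin,
                stdout=log,
                stderr=subprocess.STDOUT,
                # Calls setsid() in the child, detaching it from the terminal's session.
                start_new_session=True,
            )
    finally:
        # The child keeps its own copy of the descriptor; close the parent's copy.
        if stdin_path:
            stdin.close()
```

For the upload in this issue, the call would look something like `launch_detached(["gsutil", "-m", "cp", "-n", "-I", "gs://rookery-wv-rrs"], "rookery-rrs-upload.log", stdin_path="RookeryFilelistRrs.txt")`, with the paths adjusted to wherever the file list and log live.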

7yl4r commented 1 year ago

The Rookery upload is complete:

(base) tylar@manglilloo:/srv/imars-objects/rookery/Processed/RookeryFinal/ReprocessedWithDEM/FirstClassificVersion$ cat ~/RookeryFilelistRrs.txt | gsutil -m cp -n -I gs://rookery-wv-rrs | tee ~/rookery-rrs-upload.log
[...]
Skipping existing item: gs://rookery-wv-rrs/20200710T162813_01_P001_WV02_Rrs_Rookery-wDEM_v3_DEM.tif
Skipping existing item: gs://rookery-wv-rrs/20181102T160526_01_P011_WV02_Rrs_Rookery-wDEM_v3_DEM.tif
Skipping existing item: gs://rookery-wv-rrs/20181112T163728_01_P009_WV02_Rrs_Rookery-wDEM_v3_DEM.tif
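One way to double-check a log like this against the file list is to tally which names were skipped or started and report anything that never appears. This is a minimal sketch that assumes the log lines keep the "Skipping existing item: " and "Copying file://" prefixes seen in this issue; the function names are hypothetical. Note that a "Copying" line only shows that a transfer started, not that it finished.

```python
SKIP_PREFIX = "Skipping existing item: "
COPY_PREFIX = "Copying file://"


def summarize_upload_log(lines):
    """Return (skipped, started) sets of file names found in gsutil log lines."""
    skipped, started = set(), set()
    for line in lines:
        line = line.strip()
        if line.startswith(SKIP_PREFIX):
            # gs://bucket/name.tif -> name.tif
            skipped.add(line[len(SKIP_PREFIX):].rsplit("/", 1)[-1])
        elif line.startswith(COPY_PREFIX):
            # Drop the trailing " [Content-Type=...]..." and any directory part.
            path = line[len(COPY_PREFIX):].split(" [", 1)[0]
            started.add(path.rsplit("/", 1)[-1])
    return skipped, started


def unaccounted(file_list, skipped, started):
    """Return the file names from file_list that the log never mentions."""
    names = {p.strip().rsplit("/", 1)[-1] for p in file_list if p.strip()}
    return sorted(names - skipped - started)
```

With the lines of `~/rookery-rrs-upload.log` and `~/RookeryFilelistRrs.txt` as inputs, an empty result from `unaccounted` would suggest that every listed file was at least attempted.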

7yl4r commented 1 year ago

We have an issue with the jobos image sizes:

[image attachment]

7yl4r commented 1 year ago

I misunderstood the error message. I have hit the limit on total asset size for my account (see the relevant Stack Exchange post). The best way to address this is to set up the tif images as Cloud Optimized GeoTIFFs (COGs). I need to find the best way to do that.
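As one possible starting point, GDAL (version 3.1 or later) ships a COG output driver that `gdal_translate` can use. The sketch below only builds the command line; it is an assumption about how the conversion might be done, not a method chosen in this thread, and `cog_command` and its default compression are hypothetical.

```python
def cog_command(src, dst, compress="DEFLATE"):
    """Build a gdal_translate command that rewrites src as a Cloud Optimized GeoTIFF."""
    return [
        "gdal_translate",
        "-of", "COG",                    # COG driver, available in GDAL >= 3.1
        "-co", f"COMPRESS={compress}",   # creation option; the value is an assumption
        src,
        dst,
    ]
```

On a machine with GDAL installed, the command can be run with `subprocess.run(cog_command("in.tif", "out_cog.tif"), check=True)`.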

7yl4r commented 1 year ago

As a workaround, we can upload the jobos images into another user's account. For the more elegant solution of using COGs & GCS I have opened #48.