Closed sigmafelix closed 9 months ago
@sigmafelix
This is good to know. Yesterday @eva0marques sent me an R package called FedData, which includes a function for downloading NLCD data. I will investigate the function's data source and methodology while waiting for the discussion with OSC about the gdal-bin package.
I arranged with Frank at OSC to install the GDAL utilities on the highmem partitions and triton. The gdal_translate command will work on both, so a few lines of script will be enough to convert an Erdas Imagine file into GeoTIFF. Availability on the geo cluster is pending. One potential problem to think about in the near future is configuring the GitHub Actions runner with the GDAL utilities so that all (future) pipeline tests pass.
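The "few lines of script" mentioned above could look roughly like the sketch below. It assumes gdal_translate is on the PATH and that the .img files sit in a single directory; the function name convert_img_to_tif is hypothetical and not part of the project.

```shell
#!/usr/bin/env bash
# Sketch only: convert every Erdas Imagine (.img) file in a directory to GeoTIFF.
# convert_img_to_tif is a hypothetical helper name (an assumption for illustration).
convert_img_to_tif() {
  local src_dir="$1"
  local img tif
  for img in "$src_dir"/*.img; do
    # If the glob matched nothing, "$img" is the literal pattern; skip it.
    [ -e "$img" ] || continue
    tif="${img%.img}.tif"
    if [ -e "$tif" ]; then
      echo "skip (exists): $tif"
      continue
    fi
    # -of GTiff selects the GeoTIFF output driver.
    gdal_translate -of GTiff "$img" "$tif"
  done
}

# Example call (the path is a placeholder):
#   convert_img_to_tif /path/to/unzipped_nlcd
```

Skipping files that already have a .tif next to them keeps the script safe to re-run after a partial conversion.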
Even after converting the .img file to .tif with gdal_translate on Triton, I still see the strange behavior of a single-column data.frame being returned. I could not figure out exactly what causes the problem. Perhaps we need to add a warning message telling users to convert the NLCD .img file to .tif with gdal_translate locally.
Additional tests: on the geo cluster, I converted .img to .tif and got the same erroneous results. The next experiment is to convert the file in a scratch folder on the geo cluster. I heard from OSC that the scratch space on the geo cluster has been fixed so that users can access it. Perhaps this issue is trivial, but I will experiment with several approaches to identify its exact cause. I have already ruled out terra version issues, having tried 1.7.46 and 1.7.55 separately on my local machine without problems.
Since this issue is not urgent, I will run these experiments from time to time and share the results by next Monday (12/18/2023).
I tried downloading directly to a ddn location in triton:
wget -O nlcd_2021.zip https://s3-us-west-2.amazonaws.com/mrlc/nlcd_2021_land_cover_l48_20230630.zip
mkdir nlcd2021_test
unzip nlcd_2021.zip -d nlcd2021_test
I used the same script as above with the unzipped file and got:
Cannot preload entire working area of 300294205 cells with max_cells_in_memory = 3e+07. Raster values will be read for each feature individually.
|======================================================================| 100%
frac_11 frac_12 frac_21 frac_22 frac_23 frac_24
1 5.753208e-06 0 0.0001813312 8.629811e-06 0.000000e+00 0.000000e+00
2 3.514474e-05 0 0.0014484638 2.755787e-03 1.240304e-03 1.467068e-04
3 0.000000e+00 0 0.0036716100 8.414406e-04 1.294472e-04 3.739585e-05
4 0.000000e+00 0 0.0002042396 3.333175e-03 3.363703e-03 8.054491e-05
5 0.000000e+00 0 0.0000000000 0.000000e+00 0.000000e+00 0.000000e+00
6 0.000000e+00 0 0.0007550599 6.991125e-04 2.935899e-05 0.000000e+00
frac_31 frac_41 frac_42 frac_43 frac_52 frac_71
1 5.753208e-05 0.000000000 4.899567e-01 0.00000000 0.5089095 0.0008661693
2 1.794627e-01 0.000000000 8.629811e-06 0.00000000 0.8131807 0.0002286142
3 2.038847e-02 0.000000000 5.535273e-02 0.00000000 0.9092591 0.0103197088
4 4.602566e-04 0.006336919 2.493024e-01 0.01026623 0.7172992 0.0019057969
5 2.880687e-04 0.000000000 1.111000e-01 0.00000000 0.8674674 0.0209805164
6 1.542041e-03 0.000000000 0.000000e+00 0.00000000 0.7326260 0.2643483877
frac_81 frac_82 frac_90 frac_95
1 0 0 0.000000000 1.438302e-05
2 0 0 0.001475698 1.725962e-05
3 0 0 0.000000000 0.000000e+00
4 0 0 0.004418463 3.029064e-03
5 0 0 0.000000000 1.639664e-04
6 0 0 0.000000000 0.000000e+00
I think the issue was just because of a corrupted zip file.
In the long run, we need a script that verifies that the downloaded file is identical to the original on the server (e.g., using a checksum such as SHA-256). FYI, the next NLCD release is expected in 2025.
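A minimal sketch of such a check is shown here. It assumes GNU coreutils sha256sum is available and that a trusted reference digest exists, for example one recorded from a known-good download; whether the data provider publishes digests is not established in this thread. The function name verify_sha256 is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: compare a file's SHA-256 digest against a trusted reference value.
# verify_sha256 is a hypothetical helper name; the reference digest must come
# from a source you trust (an assumption, e.g. a known-good earlier download).
verify_sha256() {
  local file="$1"
  local expected="$2"
  local actual
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual=$(sha256sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (expected $expected, got $actual)" >&2
    return 1
  fi
}

# Example call (the digest is a placeholder, not a real value):
#   verify_sha256 nlcd_2021.zip "<expected sha256 hex digest>"
```

When no reference digest is available, running unzip -t on the archive tests its internal CRCs without extracting anything, which can still catch a truncated or corrupted download.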
I will close this issue.
NLCD data from the official website is distributed as zip files containing an Erdas Imagine file and its auxiliary files (ige, rrd, etc.). When running exactextractr::exact_extract on these files on ddn at HPC (i.e., triton) with the code below, the result is always a one-column data frame:
However, when the same code (with the path modified) is run on the local system, the results are as expected.
This is possibly due to the file system and technical specification of Erdas Imagine file format, seeing the file size difference in local and ddn:
Local
DDN
I also tried downloading the NLCD zip file directly from the webpage to my local machine and then uploading the unzipped files to ddn, but the results were the same.
All things considered, I converted the .img file(s) into a GeoTIFF file using
gdal_translate nlcd_2019_...img nlcd_2019_...tif
and uploaded the .tif file to ddn. The code above then worked as expected:
An NLCD preprocessing function, or an additional step in the NLCD download function, needs to be added to convert the .img file to a .tif file. A potential problem is that DDN does not have the gdal-bin package, so installation requires approval by OSC. We might use an Apptainer container for this task instead.
This issue is related to @eva0marques and @mitchellmanware. I suggest that @Spatiotemporal-Exposures-and-Toxicology add this as an agenda item for the next meeting.