Closed noorvanbeers closed 1 year ago
I found a solution to my issue! The link generated by the _get_location function in .granule_handler.py leads to a redirect, which was blocked. I have now amended the function as shown below. Note that the database_LAADS boolean is added to the download_from_granules function and passed down through each subsequent function.
@staticmethod
def _get_location(url: HttpUrl, session: Session, database_LAADS: bool) -> str:
    """Make initial request to fetch file location from header."""
    split_result = urlsplit(url)
    https_url = split_result._replace(scheme="https").geturl()
    if database_LAADS:
        location_resp = session.get(https_url, allow_redirects=True)
        location = location_resp.url
    else:
        location_resp = session.get(https_url, allow_redirects=False)
        location = location_resp.headers.get("Location")
    if not location:
        raise FileNotFoundError("No file location found")
    return location
The other two amendments stated in my initial issue are 1) adding the LAADS resource url in resources.py:
""" URLs for the API """
from enum import Enum


class URLs(Enum):
    """URLs"""

    API: str = "cmr.earthdata.nasa.gov"
    URS: str = "urs.earthdata.nasa.gov"
    RESOURCE: str = "e4ftl01.cr.usgs.gov"
    NSIDC_RESOURCE: str = "n5eil01u.ecs.nsidc.org"
    LAADS_RESOURCE: str = "ladsweb.modaps.eosdis.nasa.gov"
    EARTHDATA: str = ".earthdata.nasa.gov"
And 2) adding it to the get_url_from_granule function in granule_handler.py:
@staticmethod
def get_url_from_granule(granule: Granule, ext: ParamType = "hdf") -> HttpUrl:
    """Return link for file extension from Earthdata resource."""
    for link in granule.links:
        if (
            link.href.host
            in [
                URLs.RESOURCE.value,
                URLs.NSIDC_RESOURCE.value,
                URLs.LAADS_RESOURCE.value,
            ]
            and link.href.path.endswith(ext)
        ):
            return link.href
    raise Exception("No matching link found")
As I suspected, making the LAADS archive accessible with the modis_tools package was an easy addition; thank you again for the library!
Hi @noorvanbeers, sorry we weren't able to get to your thread here until you found a resolution. Do you anticipate that the solution you've outlined requires a change to the modis-tools codebase or its documentation?
Hi @jamie-sgro, I found this issue and I'm just commenting to say that I also wanted to use modis-tools to access data from the LAADS Archive, and ended up using pretty much the same solution as @noorvanbeers (just checking if url.host == "ladsweb.modaps.eosdis.nasa.gov" instead of adding a new bool argument). I think it would be great if this simple fix could be added to the codebase.
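For readers following along, the host-check variant described above might look roughly like the sketch below. It is not the commenter's actual code: get_location and is_laads are hypothetical standalone names, the URL is taken as a plain string rather than the library's HttpUrl type, and session is assumed to be any object with a requests-style get(url, allow_redirects=...) method.

```python
from urllib.parse import urlsplit

# Host name quoted in this thread for the LAADS archive.
LAADS_HOST = "ladsweb.modaps.eosdis.nasa.gov"


def is_laads(url: str) -> bool:
    """Return True when the URL points at the LAADS archive host."""
    return urlsplit(url).hostname == LAADS_HOST


def get_location(url: str, session) -> str:
    """Resolve the file location, following redirects only for LAADS.

    Hypothetical standalone version of _get_location; `session` is assumed
    to behave like requests.Session.
    """
    https_url = urlsplit(url)._replace(scheme="https").geturl()
    if is_laads(https_url):
        # LAADS: let the session follow the redirect chain and use the final URL.
        location = session.get(https_url, allow_redirects=True).url
    else:
        # Other archives: read the Location header of the first response.
        response = session.get(https_url, allow_redirects=False)
        location = response.headers.get("Location")
    if not location:
        raise FileNotFoundError("No file location found")
    return location
```

Deriving the behaviour from the host keeps the signature unchanged, so nothing upstream (such as download_from_granules) needs an extra argument.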
Hi @polpel, thanks for documenting what worked for you here. If you'd like, I'd absolutely invite you to create a small PR with those changes in mind, and I'd have the team review it in short order. Otherwise, I've made a task to have our team revisit your note here, which will likely result in a new PR within the next week or so.
Thank you for this great library! I would like to request the LAADS archive to be added, as I have not been able to add this myself. I saw from issue #3 that this is possible and has been implemented for another archive, however my attempts haven't been fruitful.
Is your feature request related to a problem? Please describe.
I am trying to access data in the MYDATML2 and MODATML2 collections in the LAADS archive, also available from CMR. This is currently not possible with the most recent version of modis_tools. I have attempted to add the LAADS archive in the same way that the NSIDC DAAC archive was added (downloading from the NSIDC DAAC archive is working for me):
1) In .constants.urls.py I have added "ladsweb.modaps.eosdis.nasa.gov", named LAADS_RESOURCE
2) I have added this (URLs.LAADS_RESOURCE.value) to .granule_handler.py in the get_url_from_granule function, under URLs.NSIDC_RESOURCE.value
The URL(s) generated this way are correct. When printed, I can click them and download the file from the website. An example is shown below with the following inputs:
However, a MissingSchema (invalid URL) error is raised:
When I prefix the LAADS_RESOURCE with "https://", an exception is raised that no matching link is found.
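The "no matching link" exception is consistent with how get_url_from_granule compares link.href.host against the resource values: the host part of a URL carries no scheme, so an entry prefixed with "https://" can never equal it. A minimal sketch of that comparison, using the standard library's urlsplit as a stand-in for the library's HttpUrl type (host_is_allowed and the example path are hypothetical):

```python
from urllib.parse import urlsplit

# Bare host names, as stored in the URLs enum quoted in this thread.
ALLOWED_HOSTS = {
    "e4ftl01.cr.usgs.gov",
    "n5eil01u.ecs.nsidc.org",
    "ladsweb.modaps.eosdis.nasa.gov",
}


def host_is_allowed(href: str, allowed=ALLOWED_HOSTS) -> bool:
    """Compare only the host part of the link against the allowed set."""
    return urlsplit(href).hostname in allowed


# Hypothetical granule link; the path is made up for illustration.
href = "https://ladsweb.modaps.eosdis.nasa.gov/archive/example.hdf"

print(host_is_allowed(href))  # True: bare host names match
print(host_is_allowed(href, {"https://ladsweb.modaps.eosdis.nasa.gov"}))  # False: the scheme prefix prevents a match
```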
Describe the solution you'd like
If this could be implemented I would be very grateful! The library is excellent, and I feel this feature would be a small addition to implement!