sagecontinuum / sage-data-client

Official Sage data API client.

Issue with downloading data using query #6

Open Brookluo opened 1 year ago

Brookluo commented 1 year ago

Hello,

I was trying to download SAGE imaging data using the query function. I set an early start date and left the end date unspecified so that the script would download all data available up to today. However, the query returned only a fraction of the available data: the resulting data frame had just 1220 rows, and the time span covered is very short, as the attached image shows.

[Image: Screenshot 2023-05-23 at 1 55 41 PM]

My query is:

import sage_data_client

df = sage_data_client.query(
    start="2023-01-01T00:00:00",
    # end="2023-05-22T23:51:36.246454082Z",
    filter={
        "plugin": "*mobotix-scan*",
        # "vsn" : "W071"
    }
)

These images came from only four nodes ('V023', 'V032', 'W056', 'W057'). These are not all of the nodes running this plugin, as the query browser on the SAGE portal shows: https://portal.sagecontinuum.org/query-browser?apps=registry.sagecontinuum.org%2Fbhupendraraut%2Fmobotix-scan%3A0.23.4.24

After some further research, we found that the imaging data we received comes from nodes physically located at Argonne. Could there be an issue with the code or with the internet connection?

I downloaded the sage-data-client package from PyPI, and the version is 0.5.0.post1. Please let me know if you need additional information.

Thanks, Yufeng

Brookluo commented 1 year ago

Bhupendra tested this code snippet, and it downloaded the expected data successfully.

df = sage_data_client.query(
    start="20230401-20:25:00",
    end="20230405-20:25:10",
    filter={
        "plugin": "*mobotix-scan.*",
        "name": "upload",
        # "vsn": "V008"
    }
)

It seems the only difference is the format of the timestamp string.
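To rule out the timestamp format as a factor, both strings can be normalized to one form before being passed to the query. The following is a minimal sketch using only the standard library. The helper name `to_rfc3339` is hypothetical, and treating the timestamps as UTC is an assumption; neither comes from sage-data-client itself.

```python
from datetime import datetime, timezone

# The two timestamp layouts that appear in the queries above.
KNOWN_FORMATS = ("%Y-%m-%dT%H:%M:%S", "%Y%m%d-%H:%M:%S")


def to_rfc3339(ts: str) -> str:
    """Parse a timestamp in one of the known layouts and return it as
    an RFC 3339 string. Assumption: the input is meant as UTC."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(ts, fmt)
        except ValueError:
            continue  # try the next layout
        return parsed.replace(tzinfo=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    raise ValueError(f"unrecognized timestamp format: {ts!r}")


print(to_rfc3339("2023-01-01T00:00:00"))
print(to_rfc3339("20230401-20:25:00"))
```

If both queries are rerun with normalized timestamps and still return different results, the timestamp format is probably not the cause.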

seanshahkarami commented 1 year ago

Thanks for sharing!

After looking more closely, this may actually be related to how the wildcard matching works. Notice that in one case you're filtering on "*mobotix-scan*" and in the other on "*mobotix-scan.*".

I'll double check to make sure something unexpected isn't happening when that's compiled to a regex internally.
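As an illustration of why the trailing "." could matter, here is a minimal sketch of two ways a wildcard pattern might be turned into a regex. Both translation functions are hypothetical and are not taken from sage-data-client or the Sage backend; the actual translation may differ. The plugin name is the one from the portal link above, URL-decoded.

```python
import re


def naive_wildcard_to_regex(pattern: str) -> str:
    """Hypothetical translation: replace '*' with '.*' and leave every
    other character as-is, so '.' keeps its regex meaning (any char)."""
    return pattern.replace("*", ".*")


def escaped_wildcard_to_regex(pattern: str) -> str:
    """Hypothetical alternative: escape the literal parts first, so '.'
    matches only an actual dot."""
    return ".*".join(re.escape(part) for part in pattern.split("*"))


plugin = "registry.sagecontinuum.org/bhupendraraut/mobotix-scan:0.23.4.24"

for pattern in ("*mobotix-scan*", "*mobotix-scan.*"):
    naive = re.fullmatch(naive_wildcard_to_regex(pattern), plugin) is not None
    escaped = re.fullmatch(escaped_wildcard_to_regex(pattern), plugin) is not None
    print(f"{pattern!r}: naive={naive}, escaped={escaped}")
```

Under the naive translation both patterns match this plugin name, because the "." matches the ":" after "mobotix-scan". Under the escaped translation "*mobotix-scan.*" no longer matches, since the name contains no literal "mobotix-scan." substring. This only shows that the two filters are not guaranteed to be equivalent; it does not establish which behavior the service implements.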