MarsRaw / mars-raw-utils

Utilities for working with publicly available raw MSL & Mars2020 images
MIT License

m20-fetch progress bar doesn't consider the ID filtered image count #38

Closed sschmaus closed 1 year ago

sschmaus commented 1 year ago

m20-fetch progress bar reports 110 images for this command (filtered to only fetch the color images):

mru m20-fetch -s 793 -c NAVCAM -f F_0793

This is the unfiltered image count; the filtered count is only 58.

kmgill commented 1 year ago

This was a known issue since the progress bar total is populated directly from the results before the filter gets used. But, yes, this does need to be fixed.

kmgill commented 1 year ago

I should also note that this also affects MSL & NSYT fetch.

sschmaus commented 1 year ago

nothing major luckily, but this was the first small issue I noticed today :) I found some other and bigger bugs/improvements which I'll follow up on later.

kmgill commented 1 year ago

The stated issue was a result of how the remote image fetching worked. The program makes a request to the remote outreach site API, which returns a list of images and a count of the total number of images in the result set. That result set may be spread across a number of pages (each page being set at 100 images). The total is then reported back to the user as the progress bar total. The program then iterates over each image record in the results, applies the filter parameter to remove any non-matches, and downloads the images that remain. When it has completed that set, the program moves on to the next page, rinse and repeat. For each image downloaded, the program reports back to the user with an output table row and increments the progress bar by one. The problem arose when images failed the filter and were not downloaded, so the progress bar did not advance for them. At the end the progress bar would read total downloaded / total, where the total downloaded could be less than the total because of the filter. The easiest fix could have been to advance the progress bar regardless of whether or not an image passes the filter, but my concern was twofold:

Thus I took the opportunity to rewrite the remote fetch functions. This is the code behind every call to the public outreach raw image sites: the images themselves, their metadata, and the list of latest sols. The interfaces are now defined by the traits and generic output structs in remotequery. Each mission implements those traits, which then plug into wrapper functions in remotequery. Each mission (currently Mars2020, MSL, and InSight) is also now able to implement the traits in a more parallel/asynchronous manner.
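To make the trait-plus-wrapper arrangement concrete, here is a minimal sketch. It does not reproduce the actual definitions in remotequery: the `Metadata` fields, the `MissionFetch` trait, `DemoMission`, and `fetch_all` are all hypothetical names, and the "remote" call is replaced by canned data so the example runs without network access.

```rust
/// Generic record that each mission converts its own API results into.
/// Hypothetical: the real struct in the crate may carry different fields.
#[derive(Debug, Clone, PartialEq)]
struct Metadata {
    image_id: String,
    sol: u32,
}

/// Hypothetical trait that every mission implements.
trait MissionFetch {
    /// Return one page of results, already converted to `Metadata`.
    /// An empty vector means there are no more pages.
    fn query_page(&self, page: usize) -> Vec<Metadata>;
}

/// Stand-in mission returning canned records instead of calling a remote API.
struct DemoMission;

impl MissionFetch for DemoMission {
    fn query_page(&self, page: usize) -> Vec<Metadata> {
        if page == 0 {
            vec![Metadata { image_id: "DEMO_0001".to_string(), sol: 1 }]
        } else {
            Vec::new() // no further pages
        }
    }
}

/// Wrapper that only knows about the trait, not about any specific mission.
fn fetch_all(mission: &dyn MissionFetch) -> Vec<Metadata> {
    let mut all = Vec::new();
    let mut page = 0;
    loop {
        let batch = mission.query_page(page);
        if batch.is_empty() {
            break;
        }
        all.extend(batch);
        page += 1;
    }
    all
}
```

Calling `fetch_all(&DemoMission)` returns the single canned record. The point of the design is that the wrapper never needs to change when another mission is added.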

The new design changes the workflow like this: the program makes the initial query to the remote image API and then 1) converts the image records to generic Metadata objects, filters them, and stores them in a vector, and 2) determines whether more pages are required and, if so, queries for them and does the same conversion and filtering. The vector is then printed as a table on the command line, and its length is reported back to the user as the total number of images to be downloaded (remember, the filters have already been applied). Finally, the vector is iterated asynchronously and the images are downloaded, each one reported back to the user as an increment to the progress bar.
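The difference between the old and new flows can be sketched as follows. This is a synchronous simplification, not the crate's code: `old_fetch`, `new_fetch`, and the image IDs used with them are made up for illustration, pages are plain in-memory vectors, and the download step is left as a comment.

```rust
/// Old flow: the total comes from the API's unfiltered count, and the
/// position only advances for images that pass the filter.
/// Returns (position, total) as the progress bar would show them at the end.
fn old_fetch(pages: &[Vec<&str>], reported_total: usize, filter: &str) -> (usize, usize) {
    let mut position = 0;
    for page in pages {
        for image_id in page {
            if !image_id.contains(filter) {
                continue; // filtered out: no download, no progress tick
            }
            // download would happen here
            position += 1;
        }
    }
    (position, reported_total)
}

/// New flow: collect and filter every page first, report the filtered
/// length as the total, then tick once per downloaded image.
fn new_fetch(
    pages: &[Vec<&str>],
    filter: &str,
    on_total: impl FnOnce(usize),
    mut on_progress: impl FnMut(&str),
) -> usize {
    let matches: Vec<String> = pages
        .iter()
        .flatten()
        .filter(|id| id.contains(filter))
        .map(|id| id.to_string())
        .collect();

    on_total(matches.len()); // the total already reflects the filter

    for id in &matches {
        // download would happen here
        on_progress(id);
    }
    matches.len()
}
```

With three made-up IDs of which two contain the filter string, `old_fetch` finishes at 2 / 3, while `new_fetch` reports a total of 2 and ticks twice, so the bar completes.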

Now, to fetch images, use the remotequery module and specify the missions using enums::Mission.

use mars_raw_utils::prelude::*;
use mars_raw_utils::remotequery::RemoteQuery;

match remotequery::perform_fetch(
    Mission::MSL,
    &RemoteQuery {
        cameras: vec!["CHEMCAM"],
        num_per_page: 100,
        page: None,
        minsol: 3838,
        maxsol: 3838,
        thumbnails: false,
        movie_only: false,
        list_only: false,
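        search: vec!["PRC"], // field name assumed (missing in the original post): image ID filters
        only_new: true,
        product_types: vec![], // field name assumed (missing in the original post)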
        output_path: "/data/MSL/3838/CCAM/".to_string(),
    },
    |total| {
        // Set the progressbar total ...
    },
    |_| {
        // Increment the progressbar ...
    },
)
.await
{
    Ok(_) => println!("Done"),
    Err(why) => eprintln!("Error: {}", why),
};

To fetch the list of latest images:

use mars_raw_utils::prelude::*;

if let Ok(latest) = remotequery::get_latest(Mission::MSL).await {
    println!("Latest data: {}", latest.latest());
    println!("Latest sol: {}", latest.latest_sol());
    println!("Latest sols: {:?}", latest.latest_sols());
    println!("New Count: {}", latest.new_count());
    println!("Sol Count: {}", latest.sol_count());
    println!("Total: {}", latest.total());
} else {
    eprintln!("Error");
}

Still to do is a better means of registering the mission implementations with the remotequery module. The missions should not be hardcoded into remotequery (they currently are); the registration should happen at compile time or even at runtime.
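One possible shape for runtime registration is a name-keyed registry of trait objects, sketched below. Nothing here exists in the crate: `MissionFetch`, `Registry`, `FixedMission`, and the sol value used with them are placeholders chosen only to show the mechanism.

```rust
use std::collections::HashMap;

/// Hypothetical trait standing in for whatever a mission must implement.
trait MissionFetch {
    fn latest_sol(&self) -> u32;
}

/// Registry that maps a mission name to its implementation at runtime,
/// so the lookup code needs no hardcoded list of missions.
#[derive(Default)]
struct Registry {
    missions: HashMap<String, Box<dyn MissionFetch>>,
}

impl Registry {
    /// Register (or replace) the implementation for `name`.
    fn register(&mut self, name: &str, mission: Box<dyn MissionFetch>) {
        self.missions.insert(name.to_string(), mission);
    }

    /// Ask the named mission for its latest sol; `None` if it is unknown.
    fn latest_sol(&self, name: &str) -> Option<u32> {
        self.missions.get(name).map(|m| m.latest_sol())
    }
}

/// Placeholder mission that returns a fixed value instead of querying an API.
struct FixedMission(u32);

impl MissionFetch for FixedMission {
    fn latest_sol(&self) -> u32 {
        self.0
    }
}
```

After `registry.register("demo", Box::new(FixedMission(42)))`, a lookup for "demo" yields `Some(42)` and any unregistered name yields `None`. Compile-time registration would need a different mechanism and is not shown here.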

So what does all this nonsense mean for the user? Not much... The progress bar is more accurate, so there's that. API-wise, this change simplifies usage by user interfaces, be that the current command line code or a future GUI.

sschmaus commented 1 year ago

Nice work, Kevin!

kmgill commented 1 year ago

Completed