frewsxcv / rust-crates-index

Rust library for retrieving and interacting with the crates.io index
https://docs.rs/crates-index/
Apache License 2.0

Support Offline mode - for users with intermittent Internet #85

Closed by apps4uco 1 year ago

apps4uco commented 2 years ago

Hi,

Possibly related to #44: it would be great to have an offline mode. I too live in an area without reliable Internet access.

It currently appears that an Internet connection is required in order to use the crate. (If this is not the case, could the documentation be updated to show how?)

It would be great to have an offline configuration or, even better, a warning or Status struct indicating that the local mirror of the crates.io-index cannot be synchronized and so results may be stale.

There could be various options:

- offline = No Internet
- cache-only = Have Internet, but don't download anything from GitHub
- check = Have Internet; only check for updates on GitHub without downloading (produce a warning or Status struct)
- normal = Current functionality
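As a rough illustration of the four options above, here is a minimal sketch in plain Rust. The names `SyncMode`, `may_contact_remote` and `may_download` are hypothetical and are not part of crates-index; they only model which modes would be allowed to touch the network.

```rust
use std::str::FromStr;

/// Hypothetical sync modes mirroring the options proposed above.
/// None of these names exist in crates-index.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncMode {
    /// No Internet: never touch the network.
    Offline,
    /// Internet may be available, but nothing is downloaded.
    CacheOnly,
    /// Only ask the remote whether updates exist, without downloading.
    Check,
    /// Current behaviour.
    Normal,
}

impl SyncMode {
    /// Whether this mode is allowed to contact the remote at all.
    pub fn may_contact_remote(self) -> bool {
        matches!(self, SyncMode::Check | SyncMode::Normal)
    }

    /// Whether this mode is allowed to download new index data.
    pub fn may_download(self) -> bool {
        self == SyncMode::Normal
    }
}

impl FromStr for SyncMode {
    type Err = String;

    /// Parses the option names exactly as spelled in the list above.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "offline" => Ok(SyncMode::Offline),
            "cache-only" => Ok(SyncMode::CacheOnly),
            "check" => Ok(SyncMode::Check),
            "normal" => Ok(SyncMode::Normal),
            other => Err(format!("unknown sync mode: {other}")),
        }
    }
}
```

A caller could parse the mode from configuration with `"check".parse::<SyncMode>()` and report possibly stale results whenever `may_download()` is false.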

Thanks

kornelski commented 2 years ago

I think it already does that. If you have the index on your disk, you can use it. What exactly needs to be changed?

apps4uco commented 2 years ago

Hi,

The code I am using is

let index = crates_index::Index::new("/opt/git/crates.io-index/");

for crate_releases in index.crates() {
    let recent = crate_releases.most_recent_version(); // newest version
    let crate_version = crate_releases.highest_version(); // max version by semver
    println!("crate name: {}", crate_version.name());
    println!("crate version: {}", crate_version.version());
}

If there is no internet connection I get the error:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Git(Error { code: -1, klass: 12, message: "failed to resolve address for github.com: nodename nor servname provided, or not known" })'

When there is a connection the code works as expected.

The ~/.cargo/config.toml has

[source.mirror]
registry = "file:///opt/git/crates.io-index/"

[source.crates-io]
replace-with = "mirror"

Is there another way to use the index offline?

kornelski commented 2 years ago

Perhaps your checkout doesn't have the origin set? Or you're pointing it to a wrong directory?

It fetches if this is false:

        let exists = git2::Repository::discover(&path)
            .map(|repository| {
                repository
                    .find_remote("origin")
                    .ok()
                    // Cargo creates a checkout without an origin set,
                    // so default to true in case of missing origin
                    .map_or(true, |remote| remote.url().map_or(true, |u| u == url))
            })
            .unwrap_or(false);
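
To make the decision in the snippet above easier to follow, here is a minimal standard-library model of it. The `LocalRepo` enum and `treated_as_existing` function are hypothetical names introduced for illustration only; the real check runs on `git2` types.

```rust
/// Hypothetical model of what the lookup in the snippet above can find.
pub enum LocalRepo<'a> {
    /// `Repository::discover` failed: no repository at the path.
    NotFound,
    /// A repository exists but has no `origin` remote (e.g. a Cargo checkout).
    NoOrigin,
    /// `origin` exists but `remote.url()` returned `None`.
    OriginWithoutUrl,
    /// `origin` exists and has this URL.
    Origin(&'a str),
}

/// Mirrors the `map_or(true, ...)` / `unwrap_or(false)` chain:
/// a missing origin or missing URL counts as "exists",
/// otherwise the URL must be string-equal to the expected one.
pub fn treated_as_existing(repo: &LocalRepo<'_>, expected_url: &str) -> bool {
    match repo {
        LocalRepo::NotFound => false,
        LocalRepo::NoOrigin | LocalRepo::OriginWithoutUrl => true,
        LocalRepo::Origin(u) => *u == expected_url,
    }
}
```

Because the comparison is a plain string equality, an `origin` URL that differs only by a trailing `.git` would make the result false, which in turn triggers a fetch.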

apps4uco commented 2 years ago

Hi,

I added println!("Url {}", u); to your code snippet and got: Url https://github.com/rust-lang/crates.io-index.git

I originally made the copy of the crates.io index by running git clone https://github.com/rust-lang/crates.io-index.git (i.e. using the URL on the GitHub webpage).

Working code (for me at least):

let path = "/opt/git/crates.io-index/";
let url = "https://github.com/rust-lang/crates.io-index.git";

// The following line determines that the repo doesn't exist:
// let index = crates_index::Index::new_cargo_default().unwrap();

// This works for me even if there is no network connection:
let index = crates_index::Index::with_path(path, url).unwrap();

So maybe it's my bad that I didn't use your crate to create the clone of the repo in the first place. However, it might be a good idea to make it clearer in the documentation, or to add some logging, that the repo exists but doesn't have the expected name.

Thanks and sorry to bother you.

Byron commented 1 year ago

The latest release makes it easier to see whether the local index is already present by checking the directory. If it is, opening that index is very unlikely to trigger a network operation (though not impossible, as already pointed out here).

Otherwise, I was also tempted to re-add a way of opening an index without possibly cloning it, but shied away due to time constraints. PRs are definitely welcome.
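
The directory check mentioned above could be sketched roughly as follows, using only the standard library. The function name `index_looks_present` is hypothetical, and treating a `.git` entry or a top-level `HEAD` file as the sign of a git repository is an assumption of this sketch, not how crates-index itself decides.

```rust
use std::path::Path;

/// Hypothetical helper: returns true if `dir` looks like an existing
/// git repository. Assumption: a regular checkout has a `.git` entry,
/// a bare repository has a `HEAD` file at its top level.
pub fn index_looks_present(dir: &Path) -> bool {
    dir.is_dir() && (dir.join(".git").exists() || dir.join("HEAD").is_file())
}
```

A caller that must stay offline could run this check first and skip opening the index entirely when it returns false, rather than risk a clone.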

Byron commented 1 year ago

In the most recent release, v2.1, there are new GitIndex::try_new_… variants that will not try to clone a nonexistent index. As crate_() accesses have always been offline, I believe this concludes this issue.

If anything is missing though, please let me know.