rust-itertools / itertools

Extra iterator adaptors, iterator methods, free functions, and macros.
https://docs.rs/itertools/
Apache License 2.0
2.72k stars 309 forks source link

Spring Cleaning #702

Open jswrenn opened 1 year ago

jswrenn commented 1 year ago

I'm planning on kicking off a 'spring cleaning' of itertools in the next week or two. My goals are to:

  1. rustfmt the repo and integrate rustfmt into CI
  2. close all PRs that have been pending author responses for a long time
  3. triage remaining cleanup PRs and issues

More controversially, I suspect, I'd like to propose that we prefix all itertools methods with it_. The stability hazard caused by itertools conflicting with new additions to Iterator:

I had hoped that any number of language-level fixes for this issue would have landed by now, but no progress has been made on this. We can take action ourselves by prefixing itertools methods with it_. I'd like feedback on this proposal.

cc @phimuemue

phimuemue commented 1 year ago

Hi @jswrenn.

  • rustfmt the repo and integrate rustfmt into CI

Nice idea. You know my stance regarding "ill-formatted" PRs: I'd prefer a solution that accepts such PRs and applies rustfmt afterwards, but since we had so many huge PRs because people simply run rustfmt on save, I'm by now fine with any pragmatic solution here.

  • close all PRs that have been pending author responses for a long time
  • triage remaining cleanup PRs and issues

Nice.

More controversially, I suspect, I'd like to propose that we prefix all itertools methods with it_. The stability hazard caused by itertools conflicting with new additions to Iterator:

* continues to be a headache for the Rust libs team

Controversial indeed. Honestly I was not enthusiastic when I read it, but I'm sure you bothered yourself a lot about this. Moreover, I like problems to be solved where they should be solved - and if Itertools hinders progress in the Rust libs, we should strive to resolve this.

* forces us to discourage certain contributions to itertools in our README
* undermines the mission of this crate as a laboratory for Iterator improvements

I was not fully aware of the fact that Itertools is a laboratory for Iterator improvements (I had something in the back of my mind, but was not positively sure about this). I personally like this whole laboratory idea - in particular if it leads to upstreaming certain functionality into std::iter::Iterator.

IMHO, We can go in that direction, but we should make this explicit.

In addition: If we are already the "experimental Iterator improvers", could/should we relax the MSRV to fully exploit whatever is possible in (stable) Rust? (This is surely also controversial and fully orthogonal to the previous thoughts.)

tl;dr: Your suggestion is fine with me, and our README should explicitly state our rationale.

I do not know what time window you imagined, but, as always, a bit more feedback may be helpful here.

jswrenn commented 1 year ago

Controversial indeed. Honestly I was not enthusiastic when I read it, but I'm sure you bothered yourself a lot about this. Moreover, I like problems to be solved where they should be solved - and if Itertools hinders progress in the Rust libs, we should strive to resolve this.

I'm also not particularly enthusiastic about this solution, but it's an immediate thing we can do to help reduce conflicts between Iterator and Itertools, and — should rustc ever implement edition-aware method resolution or supertrait item shadowing — it's a decision we can reverse just by dropping the it_ prefixes.

In addition: If we are already the "experimental Iterator improvers", could/should we relax the MSRV to fully exploit whatever is possible in (stable) Rust? (This is surely also controversial and fully orthogonal to the previous thoughts.)

Yes, I think we should! The stabilization of const generics is a compelling reason, on its own, for us to bump the MSRV.

Philippe-Cholet commented 1 year ago

There is quite some changes to be released since "0.10.5" right? I think all that should be released, then only then make a release with it_ prefixes and MSRV bump as a "new start" for the crate.

jswrenn commented 1 year ago

@Philippe-Cholet Agreed!

@Philippe-Cholet, @phimuemue I'm weighing whether we should include a transitional release, in which all of the methods are present with both their current names (but deprecated) and their it_ prefixed names. Thoughts?

Philippe-Cholet commented 1 year ago

I guess the major con would be the "copy/paste" work on our end (and delete later), and the major pro would be for users to have time to adapt especially if they did not mention a precise version in their Cargo.toml file, like itertools = "0" (but it seems a bad idea in the first place as they are small breaking changes once in a while), but they can just make their version more precise as an easy temporary fix. I would personally go with the non-deprecated way but maybe I'm missing something here?

@jswrenn A bit off-topic but since you seem to handle version bumps, I would like to point out my recent #738 for you to think about updating this place in the future.

phimuemue commented 1 year ago

I tried to automate the process of introducing the it_ variants and deprecating the old ones, but my Vim-Fu is apparently not good enough to do it reliably.

Is there some trick that we could use? I tried the following, but this did not work:

mod itertools {
    #[deprecated]
    pub trait ItertoolsDeprecated : Sized {
        fn doit(self) {}
    }

    #[allow(deprecated)]
    pub trait Itertools : ItertoolsDeprecated {
        fn it_doit(self) {}
    }

    #[allow(deprecated)]
    impl<S: Iterator> ItertoolsDeprecated for S {}
    #[allow(deprecated)]
    impl<S: ItertoolsDeprecated> Itertools for S {}
}

use itertools::Itertools;

fn main() {
    [1,2,3].iter().doit();
}

if they did not mention a precise version in their Cargo.toml file, like itertools = "0" (but it seems a bad idea in the first place as they are small breaking changes once in a while)

I somewhat like the reasoning to discourage bad ideas, but I am not sure this is universally bad: I pretty much everywhere use itertools = "0" -- to get updates, and I explicitly accept the breaking changes. I would guess I'm not the only one.

That said: If we find an easy way of deprecating items, let's do it.

Otherwise I'm fine with releasing releasing the it_-versions, mostly because I do not want to force anyone to do the toil work of introducing the deprecated versions and making sure they do what they say.

(I think I once heard of a tool that can operate specifically on rust code to help with these transformations. It apparently accepted a macro-like expression that you could match against your code and have the engine work on the AST of the matched part. Maybe this could help, but I do not remember the details anymore.)

Philippe-Cholet commented 1 year ago

I was more thinking of

mod itertools {
    pub trait Itertools : Iterator {
        #[deprecated(note = "Use it_doit version instead, since = "0.12.0")]
        fn doit(self) where Self: Sized {}
        fn it_doit(self) where Self: Sized {}
    }
    impl<I: Iterator> Itertools for I {}
}

use itertools::Itertools;
fn main() {
    [1,2,3].iter().doit();
    [1,2,3].iter().it_doit();
}

The question is how to generate that.

I first thought of the paste crate (would be a temporary dependency) to make a macro_rules! deprecate_non_it {...} and then deprecate_non_it! { ///doc pub trait Itertools: Iterator { ... }} but I have a hard time matching against all methods... (it's awful!)

Then there is the possibility of a temporary proc-macro to do #[our_proc_macro_name] pub trait Itertools ... (syn and quote as temporary dependencies). Then some parsing... Cleaner but I prefer the next option.

Finally, still with syn/quote as temporary dependencies, a proc-macro deprecated_alias applied on each method, something like

pub trait Itertools: Iterator {
    /// doc
    #[deprecated_alias(doit, since = "0.12.0")] // easy enough to add and easy to remove later
    // other attributes
    fn it_doit(self) where Self: Sized {}
}

It would be a simple enough proc-macro (I made a more generalized version within 100 lines of code, small doc included, see details below). It basically extends to

pub trait Itertools: Iterator {
    /// doc
    // other attributes
    fn it_doit(self) where Self: Sized {}

    /// Use `it_doit` instead.
    #[deprecated(note = ``Use `it_doit` instead.``, since = "0.12.0")]
    // other attributes
    fn doit(self) where Self: Sized {}
}
`deprecated_alias` implementation (click) It's still a draft but it should work just fine. use proc_macro::TokenStream; use quote::quote; use syn::{parse_macro_input, Error, Ident, ItemFn, LitStr, Token}; /// Valid forms are: /// ```ignore /// #[deprecated_alias( /// old_name, /// /*opt*/ since="0.1.5", /// /*opt*/ note="Use `new_name` instead.", /// /*opt*/ doc="Use `new_name` instead.", /// )] /// ``` /// /// No default for `since`. /// `note` defaults to ``"Use `new_name` instead."``. /// `doc` defaults to `note`. /// /// In the future, it might accept more than functions and methods. #[proc_macro_attribute] pub fn deprecated_alias(args: TokenStream, input: TokenStream) -> TokenStream { let new_func = parse_macro_input!(input as ItemFn); let DeprecatedAlias { alias, since, note, doc, } = parse_macro_input!(args as DeprecatedAlias); let mut old_func = new_func.clone(); old_func.sig.ident = alias; // Remove doc to avoid duplicated doctests. old_func.attrs.retain(|attr| { let meta = attr.parse_meta().unwrap(); !meta.path().is_ident("doc") }); let since = since.map_or_else(Default::default, |s| quote!(since = #s,)); let note = note.unwrap_or_else(|| format!("Use `{}` instead.", new_func.sig.ident)); let doc = doc.unwrap_or_else(|| note.clone()); TokenStream::from(quote!( #new_func #[doc = #doc] #[deprecated(#since note = #note)] #old_func )) } #[derive(Debug)] struct DeprecatedAlias { alias: Ident, since: Option, note: Option, doc: Option, } impl syn::parse::Parse for DeprecatedAlias { fn parse(input: syn::parse::ParseStream) -> syn::Result { let alias = input.parse()?; let mut since = None; let mut note = None; let mut doc = None; macro_rules! replace_str { ($x:ident) => { if $x.is_some() { return Err(Error::new(input.span(), format!("Duplicated `{}`", stringify!($x)))) } input.parse::()?; $x = Some(input.parse::()?.value()); }; } while !input.is_empty() { input.parse::()?; if input.is_empty() { break; } let keyword: Ident = input.parse()?; if keyword == "since" { replace_str!(since); } else if keyword == "note" { replace_str!(note); } else if keyword == "doc" { replace_str!(doc); } else { let msg = format!("Unrecognized keyword: {}", keyword); return Err(Error::new_spanned(keyword, msg)); } } Ok(Self { alias, since, note, doc, }) } }

PS: Add my macro on previously deprecated methods would result in conflicts (error: multiple `deprecated` attributes): "step map_results foreach fold_results fold1". But maybe we would want to remove some/all of those.

jswrenn commented 1 year ago

I just drafted a document reviewing the situation: https://hackmd.io/@jswrenn/HyKLk_1Cn

Before we embark on a renaming on our end, I think I will at least try to implement RFC2845.