fsspec / filesystem_spec

A specification that python filesystems should adhere to.
BSD 3-Clause "New" or "Revised" License
1k stars 354 forks source link

Type annotations? #625

Open evinism opened 3 years ago

evinism commented 3 years ago

Are you interested in type annotations? E.g. if I submitted a pull adding a bunch of type annotations to the project, is that something that you'd see being easily merged?

martindurant commented 3 years ago

Personally, I wouldn't say that I am "interested", but I am also not opposed. Call that ambivalent maybe. I suspect it may be a great deal of work, and I am uncertain if, in the long run, it would entail more or less work for further development of the library. I am prepared to be convinced, and perhaps more voices should speak up here.

aucampia commented 1 year ago

Thanks for creating fsspec, it really is great. I really would like to see type hints for it, so I will make what case I can.

If you start off with mypy in non-strict mode it makes it fairly simple to do gradual typing, adding types to one function at a time. mypy will likely warn about some things in your code base even without you adding types, but fixing those won't be that much work. Once mypy is integrated with your CI you can add typing to parts, probably just adding it to fsspec.core would be a good next step.

As you add it to more of your code base you will likely encounter some type errors, one approach is to try and just add # type: ignore[...] comments for those. Generally keeping pull requests to only type-hint pull request works best. The # type: ignore[...] comments can be seen as fixmes, and dealt with later. I made a script for RDFLib that I use to diff two commits by removing all type hints and comments [ref]. This makes it very simple to ensure there are no runtime changes sneaking in with adding of type hints.

Of course once you have type hints in place it makes it easier to spot some categories of errors easier, and it makes it easier for users of fsspec to use fsspec correctly, as now a lot of potential mistakes of using fsspec become type errors.

Another benefit of type hints however is that sphinx can use them when generating documentation so eventually you can move the type specifications for parameters out of the docstrings.

As for maintenance burden, If people were okay adding docstrings, then this is not really that much more of a burden, or actually no additional burden at all. Some potential contributors may be putt off by it, and sometimes contributors may find the type hints a bit pendatic. mypy will complain about things which are not bugs in any meaningful sense of the word even if they are type errors, but I see no good reason to not eliminate all type errors even if they have no potential for impacting users now, they could always impact users when the codebase change.

martindurant commented 1 year ago

cc @ianthomas23

ianthomas23 commented 1 year ago

Thanks @aucampia, you present a strong case and a potential roadmap for starting this work. It is also evident that many people would like type annotations. No-one should be dissuaded by any lack of interest shown here, if a PR was submitted starting this work with some commitment to continue it, I would expect it to be merged without too much hassle. Or to put it another way, given that the python ecosystem is going in this direction I think it is inevitable that we will have to adopt type annotations eventually, it is just a question of when.

The problem really is that even if a large number of people request this then it makes no difference, what is needed is someone to actually do the work. It isn't a priority for established contributors, most of whom are inclined to fix problems that are impeding them, or add new functionality that they need. Maintainers with a wider view of the project are mostly working on consistency and reliability across the code base. Infrastructure issues like this often do not get near enough to the top of anybody's "todo" list to actually be actioned. There is a space here for a new contributor with type annotation experience to fill and if that doesn't happen probably an existing maintainer will eventually get around to it but I wouldn't expect it to be soon.

aucampia commented 1 year ago

The problem really is that even if a large number of people request this then it makes no difference, what is needed is someone to actually do the work.

I agree with this and I understand it, it is significantly important to have fsspec be useful and defect free. I hope someone like @evinism finds the time to make this contribution. And hopefully whoever does have the time to do this will take care to do it with incrementally with minimal disruption.

edgarrmondragon commented 4 weeks ago

Would it be alright to get a list of modules, then as a first step configure mypy to ignore all of them and start tackling individually in separate PRs. I'd be happy to help with that as much as possible, but I wonder if that's too big of a burden on the maintainers.

martindurant commented 3 weeks ago

Having a mypy config which ignores all files would be fine with me, if it allows others a more obvious route to adding types to the codebase.

I believe there is a lot of duck-typing in this package, so I am not sure how easy it will be to get full typing coverage, but adding annotations to obvious places is probably a useful step. (example in the last five minutes)