crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.37k stars 1.62k forks source link

FileUtils: refactoring #5231

Open j8r opened 6 years ago

j8r commented 6 years ago

What are your thoughts about this module?

Ideas are to merge the methods to the File and Dir Objects, and replace the _r by a boolean recurse argument.

Sija commented 6 years ago

in any case please use recursive over recurse.

j8r commented 6 years ago

To sum up, the propositions for FileUtils:

For recursiveness, others options are to create overloads recursive : Bool = false for each method, or create a method that yields all sub files/directories.

RX14 commented 6 years ago

FileUtils must go.

Ideally a File instance would represent an opened file, and a Dir instance would represent an opened dir, both would include the Path module which represented operations on paths. Or Path can be a seperate class. We can then work out what static methods we want based on that completely refactored core where everything can be done in terms of Path instances.

I've used java's Path class and it truly was refreshing.

j8r commented 6 years ago

@RX14 Ok, I've nothing against FileUtils, but there are inconsistences. e.g. there is a File.delete but no Dir.delete, but there is no File.unlink but a Dir.rmdir.

So anyone has an idea to replace this not-so-nice _r ? I have opened a PR to implement a chown_r and a chmod_r, but after discussion maybe we can consider an alternative for recursive action.

Anyway, why keeping aliases?

RX14 commented 6 years ago

@j8r What aliases?

With my idea we'd just obviate the recursive by making it as easy as path.each_child(recursive: true, &.delete)

j8r commented 6 years ago

@RX14 The actual aliases that point to File or Dir are: FileUtils.cd, FileUtils.mkdir(_p), FileUtils.mv, FileUtils.rm, FileUtils.rmdir, FileUtils.pwd and FileUtils.touch

Your way is elegant, but with it we will need to handle the directory itself separately.

sevk commented 6 years ago

why not make FileUtils to shard ?

ysbaddaden commented 6 years ago

@j8r the FileUtils module is for short, shell-like commands, and offers a nice touch in some scenarios, for example helpers in a test suite, or a build tool: include FileUtils, execute some short commands. It helps avoid surrounding noise.

We said before that:

But nobody tackled this.

@RX14 smart & elegant... but a headache to find and remember. I prefer a plain stupid recursive = true parameter:

class Dir
  def self.delete(path, recursive = false, force = false)
    if recursive && exists?(path)
      each_child(path) { |child| Dir.delete(child, recursive, force) }
    end
    unless LibC.rmdir(path) == 0
      return if force && Errno.value == Errno::ENOENT
      raise Errno.new("rmdir")
    end
  end
end

A Path abstraction could be nice, too, but maybe as a complement to this issue (make FileUtils optional, harmonize the API of File/Dir namespaces).

RX14 commented 6 years ago

FileUtils shouldn't be required to do anything. Therefore FileUtils becomes a module full of aliases - and therefore we should remove it entirely. It's ugly and it's unneccesary. Lets have one API for file manipulations, and lets make it clean and crystal-like, not ugly like FileUtils.

Sija commented 6 years ago

@ysbaddaden @RX14 assuming we have Pathname floating around, what would you said should be the return type of File#path, Dir#path or similar? What about path argument to Dir.cd(path), should it be String or Pathname?

ysbaddaden commented 6 years ago

I would hate to be required to wrap a String in a Path or Pathname struct just to use it with Dir or File methods.

@RX14 you don't like FileUtils, fine. I like it, and I consider that it does provide a nice, verbose-less API that helps focusing on actual commands not crystal code, and that it's useful in many scenarios (e.g. script-like programs). Having to install/require an external shard for that would be very inconvenient. This ain't just "ugly" aliases.

RX14 commented 6 years ago

@Sija #path would return Path, and since we're crystal we just make most methods that currently take a path, take either String or Path.

straight-shoota commented 6 years ago

I've just stumbled over #4096 which introduces a nice addition (creating temporary directories) but the PR got stalled because it depends on FileUtils.rm_r to clean up - and FileUtils shouldn't be required by Dir. In it's predecessor #2911 @ysbaddaden already suggested Dir.delete(path, recursive = true) and I think this would be good way to start.

Even simpler, when you have a File (or Dir) object and want to remove it, this is currently done by: File.delete(file.path). Isn't Crystal supposed to be object oriented...? 😆

straight-shoota commented 6 years ago

It seams we all agree that organizing path-related operations in Path is a good idea.

I think the current way of handling paths as strings is really great in many ways (though certainly not all), because it's simple and you can directly manipulate them. For method arguments, both String and Path can easily be combined through overloading / accepting union types, similar as many HTTP methods for example accept both URI and String. So @ysbaddaden shouldn't be "required to wrap a String in a Path or Pathname struct just to use it with Dir or File".

A bit more challenging are return types because you can't overload them, so all methods returning a path value would always return a Path. However: while it would be easy to convert it to a string, it is still an additional step that might make things a tiny bit more complicated than today. Another example are the dir iterators: it makes probably sense to have a variant that returns all children as (full) paths, but also a variant that only returns the names (presumably as string). So we'll need two variants of these methods. In some places there are a few drawbacks, but I guess that's the price to pay.

RX14 commented 6 years ago

@straight-shoota I think that there's very few times where String isn't already treated as a black-box object (meaning that no String operations are called on it, like Path), apart from the case of literals. The benefit of accepting String is only for the case of literals or converting to a path from IO. All path manipulation should be done by File methods already, encapsulating to a Path simply helps enforce that.

j8r commented 5 years ago

I would like to update this issue with a proposed roadmap.

The goal is to include missing feature provided by FileUtils to Dir and File, in order to make it optional.

@RX14 proposal can be implemented by walking through directories. If Path#walk existed, one can recursively create missing subdirectories:

Path.new("/aaa/bbb/ccc").walk do |path|
  Dir.create path, force: true
  #=> /aaa/
  #=> /aaa/bbb/
  #=> /aaa/bbb/ccc
end

If Path#absent(&block : Path ->) (yield Path if absent) and Path#create_dir existed, it could even be Path.new("/aaa/bbb/ccc").walk &.absent &.create_dir

Copying recursively is another story.

Changes:

Don't know about FileUtils.cmp, maybe moving it to File.compare?

Any suggestion?

straight-shoota commented 5 years ago

Path#walk would be Path#each_parent which already exists.

rubyFeedback commented 5 years ago

For what's its worth, I find Path to be a significantly worse name than File and Dir.

One can say that it uses an abstract concept, so may fit better than File and Dir, but I knew it from ruby where I just absolutely hate Pathname.new; I refuse to use it.

I am fine with Dir. and File namespace, and in ruby I have no huge problem with FileUtils (although I agree that it is a bit strange ... for example I can use File.delete() and don't have to require fileutils specifically, in ruby .. this is a bit awkward IMO).

I can not give a substantial opinion for crystal or how the API should be, but I dislike too abstract concepts - File. and Dir are very simple for me to understand, conceptually. This is why I agree with ysbaddaden (although I have no explicit opinion on the recursive option; to me recursive is somewhat decoupled from the discussion about File, Dir, FileUtils, and path; it's just a parameter right? So not that important to me personally in general).

straight-shoota commented 5 years ago

Path is not intended to replace File and Dir. They have different purposes. And the discussion about refactoring/removing FileUtils is also only loosely related to Path.

bcardiff commented 5 years ago

I have two thoughts:

j8r commented 5 years ago

Why having to pass Path as argument? Couldn't we have something OOP like in Java Path#to_file (an object for FileSystem operations)?

straight-shoota commented 5 years ago

@j8r: File.open(path) is OOP and it's a much cleaner API than replicating File's class methods as instance methods on Path.

j8r commented 5 years ago

Nope @straight-shoota, I havent said creating Path::File. It would be nice interoperability to have Path#to_file ::File, then being able to Path["log"].to_file.touch for instance. Then, having also File#path : ::Path, too

straight-shoota commented 5 years ago

I didn't understand you'd be suggesting a path-based File implementation. Yet, Path["log"].to_file.touch looks like that. There is no instance method File#touch. It's a class method and your example would be File.touch(Path["log"]) (or simply File.touch("log")). In any case, that's exactly what I meant in my previous comment and what I'd like to avoid: Replicating (or moving) the class method interface of File to either Path or File instances. It just gets convoluted. The division between File's class and instance API is great as it is.

stellarpower commented 2 years ago

Adding here, I feel like there is room for some module that provides more shell-like utilities that are quick one-liners. I mostly use Ruby for scripting, and this is how I plan to use Crystal, for info, so these would be useful for me but may be more like pollution to a systems programmer. But, perhaps it would make more sense to move some of FileUtils to more like a replacement for/equivalent of some common shell/binutils commands, and implement any missing functionality in core classes.

As duplicated in the above linked issue, sometihng like cat would be very neat for me - i.e. FileUtils::cat (or as some other module) as a one-liner to read in the contents of a file as a string. head and tail could also then take lines as usual. This could well be better in a shard, I would have no problem with that myself. I remember using a gem called something like ShellUtils; so if folks above feel that the core OO-based classes need more features, then I wouldn't disagree with merging core features into those and splitting the more utility-like, shell-flavoured, unix-smelling methods somewhere separate, for when we want a quicker, possibly less clean, but still useful one-liner.

Blacksmoke16 commented 2 years ago

sometihng like cat would be very neat for me - i.e. FileUtils::cat

Cross posting for visibility as well. This already exists in the form of File.read. I also doubt we'd want to add cat as an alias, given that's something Crystal tries to avoid.

stellarpower commented 2 years ago

Maybe I have had to do too much Python at work, because I wasn't aware that read just returned a String. In Ruby I've seen too many things like this where it's read in a block, or using readline which is just too verbose for many uses where I just want the whole text as a string. So fair enough cat is redundant in this case.

But point being I think some of these utilities could be quite useful. Personally the aliases are a feature I like about Ruby a lot. I just lost ten minutes trying to find out what collect is called, because I had forgotten inject, and obviously at this stage there isn't much mention of the language online. So it's hard to find things via google that aren't about angelic healing etc. But it seems you don't want to go that way, so fair enough. I think though that file-like utilities would make sense as a separte shard, if the aim is to keep the stdlib cleaner, more OO, minimal, and without aliases, my "two cents" would be towards something like that.

Blacksmoke16 commented 2 years ago

In Ruby I've seen too many things like this where it's read in a block

Yea, File.read just wraps that block into a higher level API. Are quite a few methods like that. I.e. that exist as a class method but are implemented via opening the file and doing something. There are valid reasons for both. E.g. File.read_lines versus File.each_line where the former would load the entire file contents into memory and return the lines in an array, while the latter yields one line at a time. Which can be helpful for larger files where you don't need all the lines at once.

So it's hard to find things via google that aren't about angelic healing etc

Works a lot better if you search for crystal lang .... Keeping the API docs open and just going to the related type and ctrl + f or using the search there can also be helpful.

But it seems you don't want to go that way, so fair enough

Related: https://github.com/crystal-lang/crystal/wiki/FAQ#why-are-aliases-discouraged

stellarpower commented 2 years ago

Yeah, without looking it up I'd have expected File#read to return an IO or something. I have to jump between multiple languages a lot of the time so I always forget the basics. The fine points I will remember until I die, but reading CSV or JSON, which I am doing every day, and I just Google it. It all gets polluted with each different way to do things. That's why personally I love both the flexibility and the aliases - less to remember and (like read above) "It Just Worked" for me is helpful over having to fiddle with something slightly.