apache / opendal

Apache OpenDAL: access data freely.
https://opendal.apache.org
Apache License 2.0

HTTP service support list operation #3688

Open kebe7jun opened 11 months ago

kebe7jun commented 11 months ago

OpenDAL is a fantastic project, and I am attempting to integrate it into my system. However, I've noticed that the HTTP service does not currently support the list operation, which poses significant difficulties for us: we are unable to use commands like cp -r against an HTTP file system.

The directory index pages that servers like Nginx or Apache generate for static files have quite similar structures, as seen on pages like https://mirrors.tuna.tsinghua.edu.cn/apache/hbase/stable/. It seems feasible to support HTTP list operations by parsing specific HTML structures. To my knowledge, platforms like Alluxio also support this kind of operation.

I couldn't find any related information in the issue section, so I wanted to ask if there is any reason we are not planning to support this? Are there any concerns? If not, I am willing to help the community accomplish this task. Would the community be open to accepting my contribution?
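To make the idea concrete, here is a minimal sketch of what "parsing specific HTML structures" could look like for an autoindex-style page. It is an illustration only: `parse_index` is a hypothetical helper, not an OpenDAL API, the sample page and file names are invented, and a naive `href="..."` scan like this would likely need a proper HTML parser and per-server handling in practice.

```rust
/// Extract entry names from an autoindex-style HTML page.
/// Hypothetical helper for illustration; not part of OpenDAL.
fn parse_index(html: &str) -> Vec<String> {
    let mut entries = Vec::new();
    let mut rest = html;
    // Scan for every `href="..."` attribute in document order.
    while let Some(pos) = rest.find("href=\"") {
        rest = &rest[pos + 6..];
        let Some(end) = rest.find('"') else { break };
        let href = &rest[..end];
        rest = &rest[end + 1..];
        // Skip parent links, column-sort links (e.g. `?C=N;O=D`),
        // and absolute links that point outside the listed directory.
        if href.is_empty()
            || href.starts_with('?')
            || href.starts_with('/')
            || href.starts_with("../")
            || href.contains("://")
        {
            continue;
        }
        entries.push(href.to_string());
    }
    entries
}

fn main() {
    // Invented sample resembling an Nginx/Apache directory index.
    let html = r#"<html><body><h1>Index of /apache/hbase/stable/</h1><pre>
<a href="../">../</a>
<a href="?C=N;O=D">Name</a>
<a href="docs/">docs/</a>
<a href="hbase-bin.tar.gz">hbase-bin.tar.gz</a>
</pre></body></html>"#;
    for entry in parse_index(html) {
        // A trailing slash conventionally marks a directory entry.
        let kind = if entry.ends_with('/') { "dir" } else { "file" };
        println!("{kind}\t{entry}");
    }
}
```

The sketch treats a trailing slash as the only signal for a directory, which is a convention of these index pages rather than a standard.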

Xuanwo commented 11 months ago

> I couldn't find any related information in the issue section, so I wanted to ask if there is any reason we are not planning to support this? Are there any concerns?

We don't support list operations because there is no established standard for directory listings over HTTP. Parsing unknown HTML can be challenging and error-prone. I'm not sure whether we should add such logic inside the HTTP service.

> platforms like Alluxio also support this kind of operation.

Would you like to provide some references?

kebe7jun commented 11 months ago

https://docs.alluxio.io/os/user/stable/en/ufs/WEB.html

Alluxio provides examples of mounting a file system based on the web (HTTP) protocol.

Xuanwo commented 11 months ago

It seems alluxio.underfs.web is not part of open-source Alluxio.

kebe7jun commented 11 months ago

rclone also does that:

https://github.com/rclone/rclone/blob/9061e8185054a713f0c218b0e33e64993f19a514/backend/http/http.go#L382-L412

Xuanwo commented 10 months ago

Hello @kebe7jun, do you think it would be beneficial for an HTTP service to accept a closure?

For instance:

let mut cfg = services::Http::default();
// `index_parser` is the proposed setter; `my_parser` is a user-supplied closure.
cfg.index_parser(my_parser);

...
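One way the closure idea could be shaped is sketched below, with the caveat that everything here is an assumption: `HttpSketch`, `IndexParser`, and `list` are hypothetical stand-ins, not OpenDAL types, and the closure signature (`&str` page body in, entry names out) is just one plausible choice.

```rust
use std::sync::Arc;

/// Hypothetical parser type: takes the index page body, returns entry names.
type IndexParser = Arc<dyn Fn(&str) -> Vec<String> + Send + Sync>;

/// Stand-in for an HTTP service builder; not the real `services::Http`.
#[derive(Default)]
struct HttpSketch {
    index_parser: Option<IndexParser>,
}

impl HttpSketch {
    /// Register a user-supplied closure that knows the server's HTML layout.
    fn index_parser<F>(&mut self, parser: F) -> &mut Self
    where
        F: Fn(&str) -> Vec<String> + Send + Sync + 'static,
    {
        self.index_parser = Some(Arc::new(parser));
        self
    }

    /// Apply the parser to a fetched page body.
    /// Returns `None` when no parser is configured, i.e. list is unsupported.
    fn list(&self, body: &str) -> Option<Vec<String>> {
        self.index_parser.as_ref().map(|parse| parse(body))
    }
}

fn main() {
    let mut cfg = HttpSketch::default();
    // Toy parser: one entry per non-empty line. A real one would parse HTML.
    cfg.index_parser(|body: &str| {
        body.lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(String::from)
            .collect()
    });
    println!("{:?}", cfg.list("a.txt\nb/\n"));
}
```

Keeping the parser optional means the service could keep reporting list as unsupported by default, and the HTML-specific logic stays in user code rather than inside the service.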