Closed: s4zuk3 closed this 2 weeks ago
Thanks, Francisco, for putting in the effort to make this pull request. It's a big feature and will require upgrading to 0.2. I was planning to do this myself, but it's great to see that you jumped on it.
I'll need some time to look into this.
Is it OK if I make changes straight to your branch?
Sure, please make any necessary changes! Thank you very much!
@nmammeri Hey! Is there anything missing, or can I help you with something to get this PR merged?
Sorry, I didn't have much time to look into it. Thanks again for your work.
The metadata is a `Vec<String>` and not a `HashMap<String,String>`. On the Tika side the metadata is also a Map. I strongly think Map is better. It would be great if you can change the Rust side to use `HashMap<String,String>` and the Python side to use Dict: `type Metadata = HashMap<String,String>`
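To make the suggestion concrete, here is a minimal sketch of the map-based type on the Rust side. The `Metadata` alias is the one proposed above; `metadata_from_pairs` and the flat `key=value` string layout are assumptions made purely for illustration and are not taken from the PR or from Tika.

```rust
use std::collections::HashMap;

/// Metadata as a map, mirroring the alias suggested in the review.
pub type Metadata = HashMap<String, String>;

/// Hypothetical helper: folds flat "key=value" strings into a `Metadata` map.
/// The "key=value" layout is an assumption for this sketch only.
/// Entries without '=' are skipped; on duplicate keys the last value wins.
pub fn metadata_from_pairs(pairs: &[String]) -> Metadata {
    pairs
        .iter()
        // Split each entry at the first '='; drop entries that have none.
        .filter_map(|entry| entry.split_once('='))
        // Trim surrounding whitespace and take ownership of both halves.
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}
```

With a map, callers can look up a single field directly (for example `md.get("Content-Type")`) instead of scanning a list of strings, and the type translates naturally to a Python Dict.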
[…] `JStringResult`. We have 2 return types: `JStringResult`, where the whole file is parsed into a string, and `JReaderResult`, where a stream is provided and can be read by the user. `JReaderResult` is useful if you don't know the max size of your content or if you want to do buffered reading. […] `JReaderResult`, but let's try to make it work for `JStringResult` first.
Please let me know if you can make those changes. Many thanks
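As a rough illustration of the difference between the two return styles described above, the sketch below uses only the Rust standard library. `read_all` and `read_buffered` are hypothetical names, not functions from this repository: the first loads everything into one string, the second consumes a stream through a fixed-size buffer, which is the pattern that helps when the content size is unknown.

```rust
use std::io::{self, Read};

/// Reads the entire stream into a single String.
/// Simple, but the whole content must fit in memory.
pub fn read_all<R: Read>(mut reader: R) -> io::Result<String> {
    let mut out = String::new();
    reader.read_to_string(&mut out)?;
    Ok(out)
}

/// Reads the stream through a fixed-size buffer, handing each chunk to
/// `on_chunk`, and returns the total number of bytes read.
/// Memory use is bounded by `buf_size` regardless of content size.
pub fn read_buffered<R, F>(mut reader: R, buf_size: usize, mut on_chunk: F) -> io::Result<usize>
where
    R: Read,
    F: FnMut(&[u8]),
{
    // Guard against a zero-length buffer, which would read nothing.
    let mut buf = vec![0u8; buf_size.max(1)];
    let mut total = 0;
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break; // End of stream.
        }
        on_chunk(&buf[..n]);
        total += n;
    }
    Ok(total)
}
```

Any type implementing `std::io::Read` can be passed to either function, so the same caller code works for files, sockets, or in-memory byte slices.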
I will create a new updated PR.
Hello! This is my first time contributing to a public repository, and it’s also my first time using Rust. I hope you find the changes satisfactory. I found your repository very interesting, as I use Tika a lot but don't like having to depend on Java to use it.
I need the metadata that Tika provides, so I made the necessary changes to implement it in this repository, trying to modify as little as possible. From what I understood of the code, the metadata was already being delivered; it just wasn't fully captured and passed through the binding. So, theoretically, the performance should remain the same.
Thanks!