Anomalocaridid / handlr-regex

Fork of handlr with support for regex
MIT License

Wrong type association on html files #56

Closed octvs closed 5 months ago

octvs commented 8 months ago

handlr identifies some html files as text/plain, which I run into when rendering html e-mails. Related upstream issue.

minimal example:

$ echo "<p> Hello World! <a href='https://google.com'></a> </p>" > example.html
$ file --mime-type example.html
example.html: text/html
$ cat example.html
$ handlr mime example.html
┌──────────────┬────────────┐
│ path         │ mime       │
├──────────────┼────────────┤
│ example.html │ text/plain │
└──────────────┴────────────┘

In case it is relevant, in the following example results are consistent:

$ echo "<p> Hello World! </p>" > example.html
$ file --mime-type example.html
example.html: text/plain
$ cat example.html
$ handlr mime example.html
┌──────────────┬────────────┐
│ path         │ mime       │
├──────────────┼────────────┤
│ example.html │ text/plain │
└──────────────┴────────────┘

octvs commented 8 months ago

A commit that mentions this issue appears in the aforementioned upstream thread. It leads to another fork, which is probably relevant to this discussion.

Anomalocaridid commented 8 months ago

I can replicate the bug on my end with handlr-regex v0.10.0.

xdg-mime query filetype gives text/html, so it's not just "expected" behavior. Also, the commit you linked appears to be cherry-picked from this repo, and the other fork appears to still have this issue.

I'll take a look at it later and see what I can do.
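
For anyone who wants to check their own files, the three detectors mentioned in this thread can be compared side by side. The following is only a rough sketch: `compare_mime` is a hypothetical helper name, not part of handlr or any other tool, and it assumes nothing beyond the `file --mime-type`, `xdg-mime query filetype`, and `handlr mime` invocations already shown above. Tools that are not installed are reported instead of failing.

```shell
# Hypothetical helper: print what each available tool reports for one file.
compare_mime() {
    target=$1

    # Content-based detection via file(1).
    if command -v file >/dev/null 2>&1; then
        printf 'file:     %s\n' "$(file -b --mime-type "$target")"
    else
        printf 'file:     (not installed)\n'
    fi

    # Detection via the shared MIME database, as queried by xdg-mime.
    if command -v xdg-mime >/dev/null 2>&1; then
        printf 'xdg-mime: %s\n' "$(xdg-mime query filetype "$target")"
    else
        printf 'xdg-mime: (not installed)\n'
    fi

    # handlr prints its own table, so its output is passed through as-is.
    if command -v handlr >/dev/null 2>&1; then
        handlr mime "$target"
    else
        printf 'handlr:   (not installed)\n'
    fi
}
```

Running `compare_mime example.html` on the first example from the report should make any disagreement between the tools visible at a glance.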

Anomalocaridid commented 7 months ago

Okay so I took a look at it a few days ago. I think the culprit might be some of the xdg_mime crate's behavior for mimetype detection. Unfortunately, some bugs that lead to inconsistent behavior in get_mime_types_from_file_name() make it difficult to work around it without causing other issues, and I cannot find any alternative crates that work exactly right.

Just to be clear, fixing this is still a priority. I just figured I'd give an update on where I'm at with figuring out a fix. Worst-case scenario would just be handling html files as a special case, which isn't ideal, but not a big deal.
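
To illustrate what "handling html files as a special case" could look like, here is a minimal shell-level sketch. It is not how handlr implements anything: `guess_mime` is a hypothetical wrapper, and the idea of trusting the `.html`/`.htm` extension before falling back to content detection with `file` is an assumption made purely for illustration.

```shell
# Hypothetical wrapper: trust the extension for HTML, otherwise inspect content.
guess_mime() {
    case "$1" in
        *.html|*.htm)
            # Special case: report text/html based on the file name alone.
            printf 'text/html\n'
            ;;
        *)
            # Every other file falls back to content-based detection.
            file -b --mime-type "$1"
            ;;
    esac
}
```

With this wrapper, `guess_mime example.html` reports text/html for both examples in the original report, because the file content is never consulted for that extension. The obvious downside is the one noted above: every such special case has to be maintained by hand.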

Anomalocaridid commented 7 months ago

I made a PR to xdg_mime to fix this: https://github.com/ebassi/xdg-mime-rs/pull/28. It has already been merged, so when this gets fixed will just be a matter of how long it takes for a new release to come out.

Anomalocaridid commented 5 months ago

Sorry for taking so long with this. A release for xdg-mime was made a while ago, but I was waiting on another, unrelated pull request I had made. I decided to just fix this rather than wait any longer, because I understand how it feels to wait a long time for a bug to get fixed.