Flying-Toast / yugo

Yugo is an easy and high-level IMAP client library for Elixir.
https://hex.pm/packages/yugo
MIT License
42 stars 7 forks source link

Add calls to process headers as well #19

Closed ponychicken closed 7 months ago

ponychicken commented 8 months ago

For an archiving project I needed to be able to access all the headers of the mails as well. This commit adds the string containing the original headers to the messages

Flying-Toast commented 8 months ago

Can you please update the tests too? We should also add some new tests specifically for headers.

ponychicken commented 8 months ago

I have fixed the line and added two tests to the parser

Flying-Toast commented 8 months ago

Did you push? It doesn't look like the commits in the PR updated.

tzugen commented 8 months ago

Ah yes the push failed and I closed my laptop. Its there now.

Flying-Toast commented 8 months ago

Great. Just a few more things: 1) Let’s update the Yugo.email typespec to include the new field - I also think we can name the field simply “headers” instead of “rfc822_header” 2) I see you updated the expected client messages in the tests to expect the client to request the RFC822.HEADER field, but it still doesn’t have the server send them back. You should change the S: … lines so that they include a variety of sample headers and then ensure that the email that is received actually contains those headers. 3) I’d like us to parse the headers into a more elixir-y data type (probably an array of {name, value} tuples). This might be as simple as splitting each line on “: “ but check the IMAP spec to see what exactly it would entail.

Thanks again for the work so far!

ponychicken commented 7 months ago

Ok, 1 & 3 are done. Regarding 2 I am not sure how to proceed.

A sample response might be this: S: * 2 FETCH (FLAGS (\sEEn) BODY ("text" "plain" ("charset" "us-ascii" "format" "flowed") NIL NIL "7bit" 47 6) RFC822.HEADER {1992}\r\nReturn-Path: <email@example.com>\r\n ENVELOPE ("Wed, 07 Dec 2022 18:02:41 -0500" NIL (("Marge Simpson" NIL "marge" "simpsons-family.com")) (("Marge Simpson" NIL "marge" "simpsons-family.com")) (("Marge" NIL "marge" "simpsons-family.com")) (("HOMIEEEE" NIL "homer" "simpsons-family.com")) NIL NIL NIL "fjaelwkjfi oaf<$ ))) \""))

Right now the assert_comms are wrapped in ~S sigil, which means it will double escape the line breaks. But if I use the ~s, to keep the linebreaks in the header sample, assert_comms will break, because it splits on \n and suddenly there will be \r\n in the middle of the string.

Flying-Toast commented 7 months ago

I'm not sure I understand the issue. assert_comms() replaces all "\n" with "\r\n" then splits on the regex ~r/\r\n[CS]:/. You should be able to put literal line breaks in the string, as is done here. Unless I am missing something - did you try that and it didn't work?

Flying-Toast commented 7 months ago

Also, headers aren't something that most users are likely to need. I think we should consider adding a mechanism to configure whether or not we want to fetch headers so we avoid downloading excess information from the server. I'm thinking we could somehow extend Filters to do this. Do you have any ideas?

ponychicken commented 7 months ago

I'm not sure I understand the issue. assert_comms() replaces all "\n" with "\r\n" then splits on the regex ~r/\r\n[CS]:/. You should be able to put literal line breaks in the string, as is done here. Unless I am missing something - did you try that and it didn't work?

Yes i tried this and it doesn't work. The problem is that both the syntax of the Client and Server response depends on linebreaks for parsing, but also inside the server response there will be linebreaks if you want to add headers, so when assert_comms does the splitting the headers end up being split.

Regarding the other things am not sure i can do this, neither do I feel so good about it. I wanted to share my work so that others can profit from it. But I personally don't need the headers parsed for example, just need to archive them. But it seems like the scope of this is growing and growing without clear indication where it goes.

Flying-Toast commented 7 months ago

it seems like the scope of this is growing and growing without clear indication where it goes.

Sorry, didn't mean to make you feel like you were doing endless work - I guess I should've laid out everything from the start. I'm still pretty new to the idea of other people actually sending in PRs for my projects so I'll try to be better about that in the future.

The Filter configuration was just something to think about, not any sort of requirement. But I still would want the headers to be fully parsed rather than just exposed as one big string - the whole idea of the library is to abstract away all the details of IMAP internals so that users don't need to know anything about the format/protocol.

If you aren't interested in completing this that's totally fine, I can see if I have some time in the next few weeks to work on this to the point where I'm comfortable including it in the library.

And thank you so much for all this work that you've already done, I really do appreciate it :-)

ponychicken commented 7 months ago

Ok, thanks. I will have a look at it again next week

Flying-Toast commented 7 months ago

Ok. I'll close the PR for now.