ocaml / v2.ocaml.org

Implementation of the ocaml.org website.
http://ocaml.org

Filter email list to not show `unsubscribe` emails. #1513

Open gs0510 opened 3 years ago

gs0510 commented 3 years ago

On the https://ocaml.org/community/ page, the recent email threads show all emails sent to the list. Filter the list so that unsubscribe emails are not displayed.

Screenshot from 2021-04-13 13-37-16

gs0510 commented 3 years ago

@Ndipbanyan since you were looking for a medium issue, you can go ahead and work on this one.

Ndipbanyan commented 3 years ago

@gs0510 Alright. Thank you. I will begin working on it and reach out for any help or clarifications that I might need.

gs0510 commented 3 years ago

@Ndipbanyan Have you been able to make any progress? Do you have any questions? Thanks!

Ndipbanyan commented 3 years ago

@gs0510 I have found the code that generates this list in rss2html.ml in the script directory, and I am trying to understand the function that does it so that I can modify it to filter the list. The main obstacle right now is my limited understanding of the OCaml language, but I am going through tutorials to catch up.

gs0510 commented 3 years ago

okay, let me know if you run into any problems! Thanks!

Ndipbanyan commented 3 years ago

@gs0510 I came up with a solution and want to be clear about it before creating a PR. Let me try to explain. The API that is consumed to display the recent email threads returns a result containing items, and each item has a title tag that holds the subject of the email followed by the sender's address. Below is what I am referring to:

Screenshot 2021-04-18 at 17 55 55

generated from https://sympa.inria.fr/sympa/rss/latest_arc/caml-list?count=40

Looking at the above, you will notice that the item titled <title>[Caml-list] - ulugbekna@gmail.com</title> has "[Caml-list]" as its email subject, the item titled <title>[Caml-list] [CFP] Logical Frameworks and Meta-Languages: Theory and Practice - enrico.tassi@inria.fr</title> has "[Caml-list] [CFP] Logical Frameworks and Meta-Languages: Theory and Practice" as its subject, and the item titled <title>[Caml-list] unsubscribe - jean-denis.eiden@orange.fr</title> has "[Caml-list] unsubscribe" as its subject.

In the code base, line 595 of /script/rss2html.ml contains a regular expression that strips "Re:" and anything between [ ] from the subject (the text between the title tags). As a result, [Caml-list] and [CFP] are removed from the titles above and only the remainder is displayed. In the case of <title>[Caml-list] - ulugbekna@gmail.com</title>, nothing is left of the subject once [Caml-list] has been removed, so only "- ulugbekna@gmail.com" is displayed. Given all this, my implementation adds unsubscribe to the regex, so that <title>[Caml-list] unsubscribe - jean-denis.eiden@orange.fr</title> ends up displayed as "- jean-denis.eiden@orange.fr" in the recent email threads.
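
The behaviour described above can be sketched as follows. This is only an illustration: the pattern is a hypothetical stand-in for the expression at line 595 of rss2html.ml, which is not quoted in this thread, and the function body is an assumption about what normalize_title roughly does. It needs the `str` library that ships with OCaml.

```ocaml
(* Hypothetical pattern: "Re:" or any "[...]" tag, each followed by
   optional spaces. The real expression in rss2html.ml may differ. *)
let tag_re = Str.regexp "Re: *\\|\\[[^]]*\\] *"

(* Remove every match of the pattern, then trim leftover whitespace. *)
let normalize_title title =
  String.trim (Str.global_replace tag_re "" title)

let () =
  List.iter
    (fun t -> Printf.printf "%S -> %S\n" t (normalize_title t))
    [ "[Caml-list] [CFP] Logical Frameworks - enrico.tassi@inria.fr";
      "[Caml-list] - ulugbekna@gmail.com";
      "Re: [Caml-list] unsubscribe - jean-denis.eiden@orange.fr" ]
```

With this pattern the second title is reduced to "- ulugbekna@gmail.com", which matches what the page shows. Adding unsubscribe to the pattern would only shorten the third title in the same way; the post itself would still be listed.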

This has become rather long :). The point of the explanation is to check whether my implementation is what you intended or whether you meant something entirely different. Thank you for taking the time to help me with this.

gs0510 commented 3 years ago

Hi @Ndipbanyan! You are almost right :) We want to stop displaying the threads that say unsubscribe in the email feed, not remove the word unsubscribe from their titles. The function normalize_title only normalizes titles (removing [CFP] and so on). What we want is to remove the unsubscribe posts from the posts list, so you can go through the list, check whether a post has unsubscribe in its title, and remove it from the list if so. Hope this helps!

Let me know if anything is unclear, or if there's anything OCaml related that you don't understand :)
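
A minimal sketch of the suggestion above, using only the standard library. The `post` record, its fields, and the helper names are hypothetical; the real post type in rss2html.ml is not shown in this thread.

```ocaml
(* Hypothetical post record; the real type has different fields. *)
type post = { title : string; author : string }

(* True when [sub] occurs somewhere in [s] (naive substring search). *)
let contains ~sub s =
  let n = String.length sub and m = String.length s in
  let rec go i = i + n <= m && (String.sub s i n = sub || go (i + 1)) in
  go 0

(* Lowercase the title first so that capitalization does not matter. *)
let is_unsubscribe p =
  contains ~sub:"unsubscribe" (String.lowercase_ascii p.title)

(* Drop matching posts; the titles of the remaining posts are untouched. *)
let drop_unsubscribe posts =
  List.filter (fun p -> not (is_unsubscribe p)) posts

let () =
  let posts =
    [ { title = "[Caml-list] [CFP] Logical Frameworks"; author = "a@example.org" };
      { title = "[Caml-list] unsubscribe"; author = "b@example.org" } ]
  in
  List.iter
    (fun p -> Printf.printf "%s - %s\n" p.title p.author)
    (drop_unsubscribe posts)
```

The point of the design is that filtering happens on the list of posts, before any title normalization, so an unsubscribe post never reaches the rendering step.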

Ndipbanyan commented 3 years ago

Thank you @gs0510 for the clarity. I will look into implementing this and let you know when I run into any issue understanding anything. Thanks

Ndipbanyan commented 3 years ago

@gs0510 I have been having issues running make or make production since I installed the OCaml Platform extension in VS Code. This is the error I was getting:

Screenshot 2021-04-23 at 07 50 09

I uninstalled the extension, but then cohttp-server-lwt ./ocaml.org would no longer start, and running make now gives the error below:

Screenshot 2021-04-23 at 08 50 02

Could you please help me figure out what the problem is?

gs0510 commented 3 years ago

@Ndipbanyan Both errors are related to omd. Can you run opam show omd to see what version of omd you have?

cohttp-server-lwt ./ocaml.org will work only if your make command is successful.

Ndipbanyan commented 3 years ago

After running opam show omd I got this:

Screenshot 2021-04-23 at 12 12 03

gs0510 commented 3 years ago

The website doesn't work with the latest version of omd (see issue #1321). You need to downgrade omd to 1.3.1, and it should be okay after that :)

Ndipbanyan commented 3 years ago

Yes! It works now. Thanks. Got me stuck there for a while.

Ndipbanyan commented 3 years ago

Also, I think I have been able to filter the emails now. My implementation is as follows: I wrote a regex for the word unsubscribe and added an else if branch to the must_keep function that excludes any post whose title matches the regex. Is this implementation okay?

Before:

Screenshot 2021-04-23 at 15 01 24

After:

Screenshot 2021-04-23 at 15 02 39

Code snippet (lines 592 and 614)

Screenshot 2021-04-23 at 15 05 46
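
Since the snippet survives only as a screenshot, here is a rough sketch of the shape described in this comment. Everything in it is an assumption made for illustration: the `post` record, the `age_days` field, the `max_age_days` limit, and the first branch of must_keep are invented, and the real function in rss2html.ml takes different arguments. It needs the `str` library.

```ocaml
(* Hypothetical post type and age limit, for illustration only. *)
type post = { title : string; age_days : int }

let max_age_days = 30

(* Case-sensitive pattern for the word "unsubscribe". *)
let unsubscribe_re = Str.regexp "unsubscribe"

(* True when [re] matches anywhere in [s]. *)
let matches re s =
  try ignore (Str.search_forward re s 0); true with Not_found -> false

(* Keep a post unless one of the exclusion branches applies. The
   [else if] branch is the addition described in the comment. *)
let must_keep p =
  if p.age_days > max_age_days then false
  else if matches unsubscribe_re p.title then false
  else true

let () =
  let posts =
    [ { title = "[Caml-list] Question about functors"; age_days = 2 };
      { title = "[Caml-list] unsubscribe"; age_days = 1 } ]
  in
  List.iter (fun p -> print_endline p.title) (List.filter must_keep posts)
```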

gs0510 commented 3 years ago

This looks good, @Ndipbanyan. You can make the regex case-insensitive so that every capitalization of unsubscribe is filtered out. You should also open a PR. :)

Ndipbanyan commented 3 years ago

Great! I've opened a PR. I used Str.regexp_case_fold instead of plain Str.regexp, which I believe makes it case-insensitive.
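
The difference between the two constructors can be checked with a short sketch like the one below. The sample titles and the `matches` helper are made up for illustration; Str.regexp, Str.regexp_case_fold, and Str.search_forward are the standard functions of the `str` library.

```ocaml
(* True when [re] matches anywhere in [s]. *)
let matches re s =
  try ignore (Str.search_forward re s 0); true with Not_found -> false

(* Same pattern, compiled case-sensitively and case-insensitively. *)
let strict = Str.regexp "unsubscribe"
let relaxed = Str.regexp_case_fold "unsubscribe"

let () =
  List.iter
    (fun t ->
      Printf.printf "%-28s strict=%b relaxed=%b\n"
        t (matches strict t) (matches relaxed t))
    [ "[Caml-list] unsubscribe";
      "[Caml-list] Unsubscribe";
      "[Caml-list] UNSUBSCRIBE me" ]
```

The case-sensitive pattern matches only the first title, while the one built with Str.regexp_case_fold matches all three.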