kenkov / kovot

Python Chatbot Framework
MIT License

Add a kovot stream for Mastodon #4

Closed kazh98 closed 5 years ago

kazh98 commented 5 years ago

This is an implementation of a kovot Stream for running kovot on Mastodon. This patch requires Mastodon.py.

Thank you for considering my proposal for your great product, kovot.


Corresponding Mastodon account: @risa@social.arnip.org.

kenkov commented 5 years ago

I closed this amazing PR by mistake, so I have reopened it.

Thank you for your Mastodon Stream code. Could I ask you to review it with the following points in mind?

First, the Stream iterator should return Message objects. Your code doesn't seem to do so; could you check it?

Second, to work with mods, Stream.post should not expect its response argument to be a MastodonResponse object. For example, a Response object doesn't have an api attribute.

Let me apologize for the lack of documentation on creating a new Stream. I have written some documentation in Japanese in the README. If you can read Japanese, could you check it?
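
The two review points on the Stream interface can be illustrated with a minimal, self-contained sketch. It does not reproduce kovot's actual classes: `Message`, `Response`, and `SketchStream` below are hypothetical stand-ins, the raw payload is a made-up dictionary, and the return value of `post` is an assumption. The only properties taken from this thread are that iterating a stream yields Message objects and that `post` relies solely on what a plain Response carries.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Message:
    """Hypothetical stand-in for kovot's Message; only `text` is assumed."""
    text: str


@dataclass
class Response:
    """Hypothetical stand-in for kovot's Response; it has no `api` attribute."""
    text: str


class SketchStream:
    """Hypothetical stream that adapts raw service payloads to Message/Response."""

    def __init__(self, raw_statuses: List[dict]) -> None:
        # In a real stream these payloads would arrive from the service API.
        self._raw_statuses = raw_statuses
        self.posted: List[str] = []

    def __iter__(self) -> Iterator[Message]:
        for status in self._raw_statuses:
            # Convert each raw payload into a Message before handing it to mods.
            yield Message(text=status["content"])

    def post(self, response: Response) -> None:
        # Use only attributes that a plain Response has; any service client
        # handle belongs to the stream itself, not to the response.
        self.posted.append(response.text)


stream = SketchStream([{"content": "hello"}])
for message in stream:
    stream.post(Response(text="echo: " + message.text))
```

Keeping the service client inside the stream means any mod that produces an ordinary Response can be used without knowing which service it is talking to.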

kazh98 commented 5 years ago

Thank you for adding detailed information to the README. I'll try to revise my pull request and reply to all of your comments as soon as possible.


Postscript (originally in Japanese): I am a Japanese speaker, so I have no problem reading and writing Japanese documents, including the README. Thank you for your advice on my pull request :-)

kazh98 commented 5 years ago

Dear kenkov-san,

I modified my pull request to make the Mastodon stream compatible with the kovot Stream interface. Could you check my modification?

However, this change loses some necessary information, such as the list of accounts related to a given toot. Hence, in practice, we can't make a bot that replies to a given toot in which it is mentioned. Fixing this problem requires some breaking extensions of the Message class, such as allowing it to carry meta information. Because this modification is outside the scope of this pull request, I would like to handle it in a separate pull request or file it as an issue.
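
One possible shape for such an extension, purely as a sketch: a message type with an optional metadata dictionary that a stream fills in and that existing mods can ignore. `MessageWithMeta`, the `meta` field, and its keys are hypothetical names chosen for illustration, and the sample status is made up; this is not the design that was later discussed in the issue.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class MessageWithMeta:
    """Hypothetical message type carrying optional stream-specific metadata."""
    text: str
    # Defaults to an empty dict, so code that only reads `text` keeps working.
    meta: Dict[str, Any] = field(default_factory=dict)


# Made-up sample payload standing in for a status received from the service.
raw_status = {"id": 1, "content": "hello", "mentions": [{"acct": "alice"}]}

message = MessageWithMeta(
    text=raw_status["content"],
    meta={
        "in_reply_to_id": raw_status["id"],
        "mentions": [m["acct"] for m in raw_status["mentions"]],
    },
)
```

With the accounts and the status id preserved in `meta`, a stream would have what it needs to address a reply to the original toot.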

In other words...

I'm sorry for the late reply.

Best Regards, Kazuhiro Hishinuma.

kenkov commented 5 years ago

Thank you for your modification to the Mastodon stream!

I understand the problem: the current Stream has no way to pass along additional information from the original Mastodon message. I have some ideas for solving it, so I will describe them in the issue you created; thank you for creating the issue!

By the way, I tested your code and found that the text attributes of Message and Response look like the following.

<p><span class="h-card"><a href="https://HOST/@username" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>username</span></a></span>テスト</p>

Is this the behavior you expected? It is desirable for the text argument of Message and Response objects to be the utterance text, like テスト in the above example. If something seems wrong with my test, please tell me.

kazh98 commented 5 years ago

Thank you very much for testing my code and for giving me your comments.

May I implement a procedure to remove the HTML tags appearing in the text attribute?

According to the Mastodon API specification, plain text is not available for content from remote servers (see https://docs.joinmastodon.org/api/guidelines/#formatting). So, we have to assume that raw data given through the Mastodon API contains <p>, <br>, <span> and <a> tags (see https://docs.joinmastodon.org/api/guidelines/#html-tags). This implies that we have to cleanse the data given through the Mastodon API if we want plain text.

One way to cleanse the data is to remove the <p>, <br>, <span> and <a> tags appearing in it, so I think we can implement it easily. However, this modification breaks the correspondence between the text attribute and the raw data from the Mastodon API. I would like your opinion and advice about this problem before I start to implement this cleansing procedure.

Sincerely yours.

kazh98 commented 5 years ago

In other words, data from the Mastodon API inevitably contains HTML tags, so may I add an implementation on the Kovot side (inside the Mastodon class) that removes the HTML tags?

kenkov commented 5 years ago

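Thank you for checking!

Regarding the removal of HTML tags, could you please handle it inside the Mastodon class? Mods are assumed to operate expecting an utterance string in Message.text, so some Mods might stop working if it contains HTML.

As for the text no longer corresponding to Mastodon's raw data: a Stream class is expected to act as an adapter that converts between raw data and Message, so that kind of implementation is fine.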

kazh98 commented 5 years ago

I see! And I'll try it! Thanks.

kazh98 commented 5 years ago

I implemented the procedure _TootListener._cleanse_html, which cleanses HTML by using html.parser.HTMLParser. I think it behaves as desired.

Could you check it?
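
For readers who want a picture of this kind of cleansing, here is a minimal sketch built on the standard library's html.parser.HTMLParser. It is not the code from the pull request: `_TextExtractor` and `cleanse_html` are hypothetical names, and turning <br> and the end of a <p> into newlines is an assumption about what a sensible plain-text rendering looks like.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text content and drops every tag (hypothetical helper)."""

    def __init__(self) -> None:
        # convert_charrefs=True decodes entities such as &amp; into plain text.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Assumption: a line break tag becomes a newline in the output.
        if tag == "br":
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        # Assumption: the end of a paragraph becomes a newline in the output.
        if tag == "p":
            self._chunks.append("\n")

    def handle_data(self, data):
        self._chunks.append(data)

    def get_text(self) -> str:
        # Strip the trailing newline left by the last paragraph.
        return "".join(self._chunks).strip()


def cleanse_html(content: str) -> str:
    """Return the text of an HTML fragment with all tags removed."""
    parser = _TextExtractor()
    parser.feed(content)
    parser.close()
    return parser.get_text()
```

Note that stripping tags alone keeps the text inside the mention link, so the sample content quoted earlier in this thread would still begin with the @username handle. Dropping that handle would be a separate step, and this thread does not say whether the merged code does so.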

kenkov commented 5 years ago

Thank you for dealing with the HTML tags! I checked your final PR, which works completely, and merged it into the master branch.