kittenswolf / WikiTextBot

reddit.com/u/WikiTextBot
GNU General Public License v3.0

Code cleanup and extensibility preparations #2

Closed allemangD closed 7 years ago

allemangD commented 7 years ago

Main changes

New Modules

PersistentList

It's not super efficient, but it's reasonable. It could probably do with some in-memory caching to avoid so many file lookups.
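
As a rough illustration of the idea (not the actual persistentlist.py from this PR), a file-backed list with the in-memory caching mentioned above might look like the sketch below. The class layout, the one-entry-per-line file format, and the method names are all assumptions made for the example.

```python
import os


class PersistentList:
    """Hypothetical sketch: a list of strings stored one per line in a text file.

    Entries are read from disk once and cached in memory, so membership
    checks do not reopen the file on every lookup.
    """

    def __init__(self, path):
        self.path = path
        self._items = None  # filled lazily on first access

    def _load(self):
        # Read the backing file only the first time it is needed.
        if self._items is None:
            if os.path.exists(self.path):
                with open(self.path, encoding="utf-8") as f:
                    self._items = [line.rstrip("\n") for line in f if line.strip()]
            else:
                self._items = []
        return self._items

    def __contains__(self, item):
        return item in self._load()

    def __len__(self):
        return len(self._load())

    def append(self, item):
        # Skip duplicates, then append to both the cache and the file.
        if item in self._load():
            return
        self._items.append(item)
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(item + "\n")

    def remove(self, item):
        # Removing an entry rewrites the whole file from the cache.
        items = self._load()
        if item in items:
            items.remove(item)
            with open(self.path, "w", encoding="utf-8") as f:
                f.writelines(entry + "\n" for entry in items)
```

With something like this, a check such as `username in user_blacklist` reads the file at most once per process; the trade-off is that changes made to the file by another process are not seen until restart.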

messageutil.py

I included documentation and examples in this file, since it's radically different from the previous system. I built on the idea of the footer_links list so that the entire footer can be defined that way. I also expanded it to cover the PM responses sent when users are blocked or included, and to allow messages to be built on top of each other.
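
To make the footer idea concrete, here is a minimal sketch of defining a footer as a list of links and layering messages on a shared base. It is not the code in messageutil.py: the function names, the `(label, url)` tuple format, the separator, and the example.com URLs are all hypothetical.

```python
def build_footer(links, separator=" | "):
    """Render (label, url) pairs as one line of markdown links."""
    return separator.join("[{}]({})".format(label, url) for label, url in links)


def build_message(body, footer_links, base=None):
    """Compose a message from an optional base text, a body, and a footer.

    Passing the result of one call as `base` of another lets messages be
    built on top of each other.
    """
    parts = []
    if base:
        parts.append(base)
    parts.append(body)
    if footer_links:
        parts.append("---")
        parts.append(build_footer(footer_links))
    return "\n\n".join(parts)


# Hypothetical footer definition; the URLs are placeholders.
FOOTER_LINKS = [
    ("PM", "https://example.com/pm"),
    ("Source", "https://example.com/source"),
]

greeting = build_message("Hello!", footer_links=[])
blocked_reply = build_message("You have been excluded.", FOOTER_LINKS, base=greeting)
```

Keeping the footer as data means a new link is a one-line change to the list rather than an edit to string-building code.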

I also started a system for detecting the intent of a user's PM or reply; it currently recognizes 'ignore_user' and 'include_user'.
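
One simple way such intent detection could work is a table of patterns keyed by intent name, sketched below. Only the intent names 'ignore_user' and 'include_user' come from this PR; the trigger phrases, the regexes, and the `get_intent` function are assumptions for illustration.

```python
import re

# Hypothetical trigger phrases; the real bot may match different text.
INTENT_PATTERNS = {
    "ignore_user": re.compile(r"\b(ignore|exclude)\s*me\b", re.IGNORECASE),
    "include_user": re.compile(r"\binclude\s*me\b", re.IGNORECASE),
}


def get_intent(text):
    """Return the name of the first intent whose pattern matches, else None."""
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(text):
            return intent
    return None
```

Adding a new user interaction then means adding one entry to the table and one handler for the returned name.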

These three systems now work together, and it should be easier to add more user interactions.

File Changes

+ bot_blacklist.txt
+ messageutil.py
+ persistentlist.py
+ requirements.txt
- sentences.py

bot.py changes

Renamed some functions to better represent their purpose

Removed unused functions and attributes:

- comment_threshold
- num_sentences
- normal_chars
- media_extensions
- image_extensions
- intro_wikipedia_link
- category_wikipedia_link
- get_thumnail
- locateByName
- enter_bot

Replaced some functions and attributes with features in messageutil:

- replace_right
- generate_footer

Replaced some functions and attributes with persistentlist:

- get_cache
- input_cache
- check_excluded
- excludeuser
- includeuser
- get_bot_list
- check_bot

msg_cache_file -> msg_cache
cache_file -> com_cache
user_blacklist_file -> user_blacklist
bot_list_file -> bot_blacklist

Altered wikipedia fetch functions:

All wikipedia fetch functions now rely entirely on the wikipedia module. This means, however, that retrieving extracts with a sentence cutoff is impossible. Instead, the entire first paragraph of a section is used. Most articles have rather small paragraphs, and in my tests I haven't noticed much difference from the current /u/WikiTextBot behavior, so I think the readability improvements outweigh that downside. The combined length of get_wikipedia_links and get_wiki_text has dropped from 111 lines to 76, with most of that reduction coming from removing duplicate code in get_wiki_text.
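
The "first paragraph instead of a sentence cutoff" approach can be sketched as a small text helper. This is an illustration, not the code in bot.py: `first_paragraph` is a hypothetical name, and it assumes the section text has already been fetched as plain text with paragraphs separated by newlines.

```python
def first_paragraph(section_text):
    """Return the first non-empty paragraph of plain section text.

    Assumes paragraphs are separated by newline characters; leading blank
    lines are skipped. Returns an empty string if there is no text.
    """
    for block in section_text.split("\n"):
        block = block.strip()
        if block:
            return block
    return ""
```

Because the cut happens at a paragraph boundary, the reply never ends mid-sentence, at the cost of occasionally being longer than a fixed sentence count would allow.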

allemangD commented 7 years ago

I realize this is a very large pull request, and I think that warrants some explanation. I had a couple of ideas related to text formatting and more detailed analysis of the parent comments, but I was struggling to find logical places to start implementing those features.

I thought it would be more useful to try to improve the current program rather than implement those things around what's there, at least since the bot is fairly small.
