ClementJ18 / moddb

A Python scraper to access ModDB mods, games and more as objects
https://moddb.readthedocs.io
MIT License

ModDB Reader

Library is now stable

The goal of the library is to navigate ModDB purely programmatically by scraping and parsing the various models present on the website. It grew out of a command in my bot that could parse either a game or a mod. That command gave birth to the original library, which was extremely limited in its abilities and could only parse a few pages, with inconsistencies. This library is a much more mature and professional attempt at the same idea, built on a much deeper understanding of OOP.

Basic Usage

The simplest way to use this library is to pass a ModDB URL to the parse_page function and let the magic happen.

import moddb
mod = moddb.parse_page("http://www.moddb.com/mods/edain-mod")
print(mod.name) #Edain Mod

Advanced Usage

Check out the documentation (https://moddb.readthedocs.io) for more information.

Installing

You can get it from PyPI: https://pypi.org/project/moddb

pip install moddb

Models

Maybe

Glossary

Development

The necessary dependencies are listed in requirements.txt and requirements-dev.txt and can be installed with the following command:

python -m pip install -r requirements.txt -r requirements-dev.txt

Testing

Testing is handled in two ways. The standard test suite is the more lightweight one: it tests the different entities of the library using one URL per entity. The extended test suite runs the standard test suite with multiple URLs for each entity. This provides better coverage but is also a lot more expensive to run and sometimes errors out because of rate limits.

In general, if you're just trying to do a sanity check on the library, it is recommended to use the standard test suite with the cassettes. This minimizes the requests made to the ModDB server and your chance of being rate limited.

Because tests in the suite grab the latest items from pages, it is essentially not possible to have a zero-request test. It is recommended to always run tests with at least the new_episodes record mode if you're planning to use the cassettes.

grep -r -l "Service Unavailable" * | xargs rm
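
The command above deletes every file whose contents include the text "Service Unavailable", presumably cassettes that recorded a rate-limited response. For anyone without grep and xargs available, a rough Python equivalent might look like the sketch below. It is not part of the library: the function names and the dry_run flag are hypothetical, and it assumes the cassettes are plain text files somewhere under a single directory.

```python
from pathlib import Path

# Text that marks a cassette as having recorded a failed response.
MARKER = "Service Unavailable"


def find_bad_cassettes(root, marker=MARKER):
    """Return files under root whose text contains marker (like grep -r -l)."""
    bad = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # errors="ignore" so oddly encoded files do not abort the scan.
        text = path.read_text(encoding="utf-8", errors="ignore")
        if marker in text:
            bad.append(path)
    return bad


def remove_bad_cassettes(root, marker=MARKER, dry_run=True):
    """Delete the matching files; with dry_run=True only report them."""
    bad = find_bad_cassettes(root, marker)
    if not dry_run:
        for path in bad:
            path.unlink()
    return bad
```

Calling remove_bad_cassettes on the cassette directory with the default dry_run=True only lists the files that would be removed. Passing dry_run=False deletes them, after which the affected tests can be run again to record fresh cassettes.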