thequbit / mayan-document-listener

A bridge between the mayan-edms and BarkingOwl
GNU General Public License v3.0

define database structure using sqlalchemy #4

Closed thequbit closed 10 years ago

thequbit commented 10 years ago

create models.py file and write classes for database architecture.

thequbit commented 10 years ago

sufficient logging should be included for auditing purposes.

All data should remain in the database and simply be flagged as processed. Purging can occur using external tools, or be built in within later versions.

The payload from BarkingOwl will look like:

payload = {
    'command': 'found_doc',
    'source_id': self.uid,
    'destination_id': 'broadcast',
    'message':  {
        'doc_url': 'http://timduffy.me/document.pdf',
        'link_text': 'some document',
        'url_data': {
            'target_url': "http://timduffy.me/",
            'title': "TimDuffy.Me",
            'description': "Tim Duffy's Personal Website",
            'max_link_level': 3,
            'creation_datetime': '2014-07-17 21:34:18',
            'doc_type': 'application/pdf',
            'frequency': 2,
            'allowed_domains': [],
        },
        'scrape_datetime': '2014-07-17 21:34:17',
    }
}

target_url, doc_url, link_text, scrape_datetime, and maybe source_id should be saved to the database.
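A minimal sketch of what a models.py class for those fields might look like, assuming SQLAlchemy 1.4 or later. The class name `FoundDocument`, the table name, the `processed` column, and the `from_payload` helper are all illustrative assumptions, not part of the project; the `processed` flag follows the earlier note that rows are flagged rather than deleted.

```python
from datetime import datetime

from sqlalchemy import Boolean, Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

# Timestamp format used in the example BarkingOwl payload.
DATETIME_FORMAT = "%Y-%m-%d %H:%M:%S"


class FoundDocument(Base):
    """One row per 'found_doc' message (hypothetical model name)."""

    __tablename__ = "found_documents"  # assumed table name

    id = Column(Integer, primary_key=True)
    source_id = Column(String, nullable=True)  # "maybe" saved, so optional
    target_url = Column(String, nullable=False)
    doc_url = Column(String, nullable=False)
    link_text = Column(String, nullable=False)
    scrape_datetime = Column(DateTime, nullable=False)
    # Rows are never deleted here, only flagged once handled.
    processed = Column(Boolean, nullable=False, default=False)

    @classmethod
    def from_payload(cls, payload):
        """Build a row from a payload dict shaped like the example above."""
        message = payload["message"]
        return cls(
            source_id=payload.get("source_id"),
            target_url=message["url_data"]["target_url"],
            doc_url=message["doc_url"],
            link_text=message["link_text"],
            scrape_datetime=datetime.strptime(
                message["scrape_datetime"], DATETIME_FORMAT
            ),
        )
```

With an engine and session in place, `session.add(FoundDocument.from_payload(payload))` followed by `session.commit()` would store the row, and setting `processed = True` later marks it as handled without removing it.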

thequbit commented 10 years ago

After further thought, I think that since the schema is unknown (due to the variability of the url_data dict), a NoSQL database (MongoDB) is the right answer for the document queue between the AMQP message bus and the upload_single_file() function. Closing ticket.
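As a rough illustration of why a schemaless store fits, the sketch below builds a queue document that keeps url_data whole, whatever keys it happens to carry. The function name `build_queue_document` and the `processed` field are assumptions made for this example; with pymongo, the resulting dict could be handed to `collection.insert_one(...)`.

```python
import copy


def build_queue_document(payload):
    """Turn a 'found_doc' payload into a dict suitable for a document store.

    Hypothetical helper: url_data is copied as-is, so keys that vary
    between messages need no schema change.
    """
    message = payload["message"]
    return {
        "source_id": payload["source_id"],
        "doc_url": message["doc_url"],
        "link_text": message["link_text"],
        "scrape_datetime": message["scrape_datetime"],
        # Deep copy so later changes to the payload do not leak in.
        "url_data": copy.deepcopy(message["url_data"]),
        # Assumed flag: documents are marked processed rather than removed.
        "processed": False,
    }
```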