govmeeting / govmeeting

Software to increase citizen involvement in democracy at the local level
MIT License

Custom code plugin system #97

Open johnpankowicz opened 4 years ago

johnpankowicz commented 4 years ago

Some custom code is likely needed to support individual government bodies. For example, the following code would need to be customized. Code to:

It is suggested to build a plug-in system that allows this custom code to be written in any language: Python, Java, JavaScript/TypeScript, C++, C#, PHP, PowerShell, etc. This would widen the pool of programmers who could write custom code for their specific community.

The custom code is currently written in C# and needs to be compiled into the WorkflowApp. Transforming transcripts works in this way. In BackEnd\ProcessMeetings\ProcessTranscript_Lib, you will see custom C# classes named:

WorkflowApp uses the Reflection API to locate and execute the "Fix" method in the correct class. For example, assume a transcript file from Austin, TX in Travis County, USA arrives in the DATAFILES/RECEIVED folder. WorkflowApp will search for the correct class and then execute its "Fix" method.
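The lookup described above can be sketched in Python. This is only an illustration of name-based dispatch, not the actual WorkflowApp code (which is C#). The helper names `build_class_name` and `find_fixer`, the lowercase `fix` method, and the idea of keeping plugin classes in one module are all assumptions made for the example.

```python
# Hypothetical plugin class; the name follows the pattern used in this issue.
class USA_TX_TravisCounty_Austin_CityCouncil_en:
    def fix(self, transcript: str) -> str:
        # Placeholder: a real plugin would normalize the Austin format here.
        return transcript.strip()


def build_class_name(country, state, county, city, body, language):
    """Join the location parts into the class name used for the lookup."""
    return "_".join([country, state, county, city, body, language])


def find_fixer(class_name, namespace=None):
    """Return an instance of the named class, or None if no such class exists."""
    namespace = globals() if namespace is None else namespace
    cls = namespace.get(class_name)
    return cls() if isinstance(cls, type) else None


name = build_class_name("USA", "TX", "TravisCounty", "Austin", "CityCouncil", "en")
fixer = find_fixer(name)
if fixer is not None:
    print(fixer.fix("  raw transcript text  "))
```

When `find_fixer` returns `None`, the caller has a natural place to fall back to some other mechanism, such as an external executable.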

It is suggested to extend this concept to allow external executables to also handle this customization. If the C# class "USA_TX_TravisCounty_Austin_CityCouncil_en" did not exist, it would search for:

jjonas3 commented 4 years ago

What would you think about creating a standardized input file format and publishing the specs? Then different entities could write code in whatever language they wanted. It could be XML or JSON, or something similar.

johnpankowicz commented 4 years ago

Yes, that is likely the best way to handle it. The project already uses JSON internally for almost all of its transcript processing, so that is a natural choice. Right now the "Fix" routines take the arbitrary text of a transcript and output a more standard text format, and the very next step converts that text format into a JSON object. We should probably specify that the custom "Fix" routines return the JSON object directly.

For example, a portion of the incoming Philadelphia transcripts looks like this:

             - - -

You created this PDF from an application that is not licensed to print to novaPDF printer (http://www.novapdf.com) Stated Meeting December 7, 2017 (215) 504-4622 STREHLOW & ASSOCIATES, INC. Page 2 1 12/7/17 - STATED - INVOCATION 2 COUNCIL PRESIDENT CLARKE: Good 3 morning. We're getting a very late

The normalized output looks like this:

Section: INVOCATION

Speaker: COUNCIL PRESIDENT CLARKE
Good morning. We're getting a very late start, so we'd like to get moving. I'm going to ask everyone, visitors to retire behind the rail. If the members

The JSON output looks like this:

> { "data": [
> 
>     {"speaker":"COUNCIL PRESIDENT CLARKE","said":"    Good morning.  We're getting a very late start, so we'd like to get moving.  I'm going to ask everyone, visitors to retire behind the rail.  If the members will take their seats, we'll have our invocation.\n    And to give our invocation this morning, the Chair recognizes Pastor Mark Novales of the City Reach Philly in Tacony.  He is here today as the guest of Councilman Bobby Henon.\n    I would ask all guests, members, and visitors to please rise.\n    (Councilmembers, guests, and visitors rise.)\n","section":"INVOCATION","topic":null,"showSetTopic":false}
> ,
>     {"speaker":"PASTOR NOVALES","said":"    Good morning, City Council and guests and visitors.  I pastor, as was mentioned, a powerful little church in -- a powerful church in Tacony called City Reach Philly.  I'm honored to stand in this great place of decision-making alongside great men and women of influence who have great purpose.\n    I believe that everyone in this Chamber has mor

johnpankowicz commented 4 years ago

However, there is one advantage to letting people convert to a normalized text output instead of JSON. People who only know something like Perl or Awk and how to use regular expressions could then write a conversion routine.
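To give a sense of how small such a regex-based routine could be, here is a rough Python sketch. It assumes the raw Philadelphia text arrives with one numbered transcript line per physical line, which the flattened sample above does not confirm. The pattern names and the `fix` function are hypothetical, and page headers such as the court reporter's name are not handled.

```python
import re

# Assumed layout: "<line number> <text>" on every physical line.
LINE_NUMBER = re.compile(r"^\s*\d+\s+")
# Assumed running header, e.g. "12/7/17 - STATED - INVOCATION".
SECTION = re.compile(r"^\d+/\d+/\d+ - [A-Z ]+ - (?P<section>[A-Z ]+)$")
# Assumed speaker turn, e.g. "COUNCIL PRESIDENT CLARKE: Good".
SPEAKER = re.compile(r"^(?P<speaker>[A-Z][A-Z .']+):\s*(?P<text>.*)$")


def fix(raw):
    """Convert raw transcript text into the normalized Section/Speaker text."""
    out, speech, section = [], [], None

    def flush():
        # Emit the buffered speech as one re-joined paragraph.
        if speech:
            out.append(" ".join(speech))
            out.append("")
            speech.clear()

    for line in raw.splitlines():
        line = LINE_NUMBER.sub("", line).strip()
        if not line:
            continue
        header = SECTION.match(line)
        speaker = SPEAKER.match(line)
        if header:
            if header["section"] != section:
                flush()
                section = header["section"]
                out += [f"Section: {section}", ""]
        elif speaker:
            flush()
            out.append(f"Speaker: {speaker['speaker']}")
            if speaker["text"]:
                speech.append(speaker["text"])
        else:
            speech.append(line)
    flush()
    return "\n".join(out).rstrip() + "\n"
```

The same three patterns could be written almost unchanged in Perl or Awk.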

jjonas3 commented 4 years ago

Well, we could write a conversion routine for text to JSON. But I think it would be a lot of work to convert different meeting formats.

johnpankowicz commented 4 years ago

That's the purpose of what I called "normalized output".

Section: INVOCATION

Speaker: COUNCIL PRESIDENT CLARKE
Good morning. We're getting a very late start, so we'd like to get moving. I'm going to ask everyone, visitors to retire behind the rail. If the member ...

The plugins would be responsible for converting the original transcript to this "normalized output". Then our existing routines will convert it to the JSON that we want. I guess I'm thinking of ways that a high school student just learning to program could write the plugin for his/her town.
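As a rough illustration of that second step, the sketch below turns the normalized text into the JSON shape shown earlier in this thread. It is not the project's existing routine. The function name `normalized_to_records` is hypothetical, and the handling of whitespace inside "said" is a guess, since the real output keeps paragraph indentation.

```python
import json


def normalized_to_records(text):
    """Parse 'Section:' / 'Speaker:' normalized text into a dict of records."""
    records, section, current = [], None, None
    for line in text.splitlines():
        if line.startswith("Section:"):
            section = line[len("Section:"):].strip()
            current = None  # speech must follow a new Speaker line
        elif line.startswith("Speaker:"):
            current = {
                "speaker": line[len("Speaker:"):].strip(),
                "said": "",
                "section": section,
                "topic": None,
                "showSetTopic": False,
            }
            records.append(current)
        elif current is not None and line.strip():
            # Any other non-blank line belongs to the current speaker.
            current["said"] += line.strip() + "\n"
    return {"data": records}


sample = (
    "Section: INVOCATION\n"
    "\n"
    "Speaker: COUNCIL PRESIDENT CLARKE\n"
    "Good morning. We're getting a very late start, so we'd like to get moving.\n"
)
print(json.dumps(normalized_to_records(sample), indent=2))
```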

However, I wonder how many cities actually produce transcripts of meetings. I've searched the websites of the larger cities in many U.S. states, and so far I've found only three that have them available online -- Philadelphia, Austin and Los Angeles. And as I said during our conference, it appears Los Angeles no longer has them. So this custom code for handling different transcript formats may not be needed much.

The other custom code that may be needed is to:

jjonas3 commented 4 years ago

Sounds good. We could have the JSON or the normalized output as inputs to the system. Whichever is easier for them.

johnpankowicz commented 4 years ago

I was thinking about ways to allow custom plugins for web scraping. I’d like to make it as easy as possible for beginner programmers. On programmer forums, I usually notice aspiring programmers who are eager to get their feet wet by writing some useful code. Maybe we can provide them an opportunity.

There could be sub-repositories in github.com/govmeeting entitled “custom_php”, “custom_powershell”, etc. Each of these sub-repos would have:

  1. A library of helper routines in that language for web scraping. It would have methods such as:

    • LoadPageAsHtml
    • LoadPageAsString
    • ExtractAllLinks
    • ExtractAllTables
    • … etc …
  2. A few samples of using these routines to scrape a town’s website for:

    • a list of upcoming meetings
    • a list of town officers’ names and positions
    • a link to the video of the last meeting (if that’s available)

The actual custom code that needs to be written should then be only a few lines.
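To make the idea concrete, here is a minimal sketch of what two of those helpers might look like in a Python version of the library, using only the standard library. The Python-style names `load_page_as_string` and `extract_all_links` are assumed counterparts of LoadPageAsString and ExtractAllLinks. The URL and HTML in the usage lines are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


def load_page_as_string(url):
    """Download a page and return its text."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class _LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_all_links(html, base_url=""):
    """Return all link targets in the HTML, resolved against base_url."""
    collector = _LinkCollector()
    collector.feed(html)
    return [urljoin(base_url, href) for href in collector.links]


# Hypothetical custom code for one town: just a few lines on top of the helpers.
html = '<a href="/meetings/2021-01-05">Council meeting</a> <a href="agenda.pdf">Agenda</a>'
links = extract_all_links(html, "https://example.gov/council/")
meeting_links = [link for link in links if "/meetings/" in link]
print(meeting_links)
```

In real use, the `html` string would come from `load_page_as_string` called on the town's website.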

For now, to make it easy on ourselves, I’d like to just create sub-folders in our existing repo, instead of creating sub-repos. I can create the following folders:

Also, instead of using sub-repositories, Git can check out only a specific folder of the main repo (a sparse checkout). Thus, someone who wanted to work only on a PHP plugin would need only “BackEnd\Custom\PHP”.

But eventually converting to sub-repos has the following advantage. End users who want to install Govmeeting on their own computers will probably not want to learn Git. But they may still want to build a custom plugin. With separate repos, they could:

The helper libraries for all the languages will already be installed with the full Govmeeting installation.