govCMS / GovCMS7

Current stable release of the main Drupal 7 GovCMS distribution, with releases mirrored at https://www.drupal.org/project/govcms
https://www.govcms.gov.au/
GNU General Public License v2.0
112 stars 76 forks

As an agency I would like to import our annual reports as structured Drupal pages #159

Closed AnnualReportTeam closed 8 years ago

AnnualReportTeam commented 8 years ago

Every year our web team spends a lot of time copying annual report content onto our website. Our annual reports are usually more than 100 pages in their print format and are highly structured. It would therefore be desirable to import those reports to the website using an automated process.

Key requirements:

It looks to us like the HTML Import module (https://www.drupal.org/project/html_import) provides the majority of those features and has been used to produce similar reports for some government agencies. Could you please explore the possibility of bringing similar functionality to GovCMS?

mgdhs commented 8 years ago

This is a problem for us too, but it is a content management problem from the very start. Is your master format InDesign like ours? That isn't a very portable or well managed/structured format. Ideally there would be a format or system we could put annual report content into that produced all versions. You may be better off creating your report in govCMS and using it to export to the print version.

cdriessen commented 8 years ago

We're also having the same discussions here and trying to determine the best way forward in terms of preparing/managing content in a central area and then publishing to different channels for different purposes (web, print). Ideally we would like a handful of staff to have access to create and edit content, and for there to be a basic workflow/approval process. There seem to be pros and cons to creating the content directly in a CMS. I'm in the process of writing some requirements around both the content creation/management side of things and the publishing requirements for an AR site, so will try to add some more detail on this soon.

invisigoth commented 8 years ago

The latest InDesign, PDF and Word formats can be converted to well-structured, accessible HTML. The annual reports of PM&C, ARC, Dept Industry, Austrade etc. are all converted from InDesign and published on their websites. A growing issue, however, is how to integrate the converted HTML with an agency's website so it can be managed and searched.

A potential solution is to use the above-mentioned HTML Import module, or a similar one, to pull the converted HTML into a Drupal site, then use the ThemeKey module to give the imported report the desired look and feel, and automatically assign taxonomy terms using the Taxonomy Autolink module.
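
As a rough illustration of the import step only (this is not the HTML Import module's actual code, and Python is used purely for the sketch), the converted HTML could be split at its headings into page records with a parent/child relationship. The names `HeadingSplitter`, `split_report` and the `title`/`parent`/`body` keys are hypothetical, and the sketch assumes chapters are marked with `<h1>` and sub-pages with `<h2>`.

```python
from html.parser import HTMLParser

# Closing one of these tags ends a block of text, so a space is inserted after it.
BLOCK_TAGS = ("p", "li", "div", "td")


class HeadingSplitter(HTMLParser):
    """Collect one record per <h1>/<h2> heading, with the text that follows it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.records = []          # raw records in document order
        self._heading_tag = None   # set while the parser is inside a heading
        self._current = None       # record currently receiving text

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            self._heading_tag = tag
            self._current = {"level": int(tag[1]), "title": "", "body": ""}
            self.records.append(self._current)

    def handle_endtag(self, tag):
        if tag == self._heading_tag:
            self._heading_tag = None
        elif tag in BLOCK_TAGS and self._current is not None:
            self._current["body"] += " "

    def handle_data(self, data):
        if self._current is None:
            return  # text before the first heading is dropped in this sketch
        key = "title" if self._heading_tag else "body"
        self._current[key] += data


def split_report(html):
    """Return page dicts (title, parent, body) for a converted HTML report."""
    parser = HeadingSplitter()
    parser.feed(html)
    parser.close()

    pages = []
    chapter = None  # title of the most recent <h1>, used as parent of <h2> pages
    for rec in parser.records:
        title = rec["title"].strip()
        if rec["level"] == 1:
            chapter, parent = title, None
        else:
            parent = chapter
        # Collapse whitespace; markup inside the body is discarded here.
        body = " ".join(rec["body"].split())
        pages.append({"title": title, "parent": parent, "body": body})
    return pages
```

Each record could then be mapped to a page in whatever hierarchy the site uses. A real import would also need to keep the body markup, images and tables, which this sketch throws away.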

Drupal is probably not the best collaborative authoring tool, and formatting complex documents like an annual report will push its WYSIWYG editor beyond its limits (here is an example: http://annualreport.arc.gov.au/2014-15/part-2-performance/chapter-5-programme-12-linkage.html). Although it is possible to export content from Drupal in XML format, convert it to ICML and import it into InDesign, it could be very time consuming for graphic designers to then format the document.

fiasco commented 8 years ago

Could Drupal instead provide the data to InDesign via a web service? That way, we could manage structured data in Drupal, allowing us to index and format the data in an optimal and consistent way, while InDesign manages the formatting of the same data for print. I know InDesign does allow data from external sources, so it would just be a matter of architecting a web service endpoint in govCMS that suited that integration.
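
To make the idea concrete, here is a minimal sketch of the kind of payload such an endpoint might return, assuming the report is held as a list of sections and that the print side consumes XML. The function name `report_to_xml` and the element names `report`, `section`, `heading` and `para` are all hypothetical; the endpoint itself and any InDesign tag mapping are out of scope here.

```python
import xml.etree.ElementTree as ET


def report_to_xml(title, sections):
    """Serialise structured report content to an XML string.

    `sections` is a list of dicts with a "heading" string and a
    "paragraphs" list of strings (an assumed shape, for illustration).
    """
    root = ET.Element("report", {"title": title})
    for section in sections:
        node = ET.SubElement(root, "section")
        ET.SubElement(node, "heading").text = section["heading"]
        for para in section["paragraphs"]:
            # ElementTree escapes characters such as & and < on output.
            ET.SubElement(node, "para").text = para
    return ET.tostring(root, encoding="unicode")
```

The same structured content could feed both the web pages and the print layout, which is the point of the suggestion; whether designers would accept that workflow is a separate question.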

AnnualReportTeam commented 8 years ago

Reports are often under embargo (confidential) until they are tabled, so having them in our Drupal site before handing them to designers is not an option.

Our current workflow is to draft our annual reports in Word (and Excel for the financial statements) and hand the files to external graphic designers to format and style. The graphic designers are responsible for delivering both PDF and WCAG-compliant HTML.

WebProject2015 commented 8 years ago

We have used the Book module for structuring our latest annual report. It is great for structuring the content but doesn't help with the process of getting content into govCMS in the first place. The HTML Import module would be very helpful with this process, and it would actually help with a number of our other tasks as well. In particular, getting new sites into govCMS would be so much easier with HTML Import. It would enable agencies to bring in all those little subsites and campaign sites back into the main site.

On that note, our current plan is to use a development environment to enable the HTML Import module, run our HTML imports, disable the module and then deploy the new content. This is not a great solution for a BAU process like annual reports, but it may be feasible given it's only once a year, and it would help with maintaining the embargo.

invisigoth commented 8 years ago

Also see related issue #119

AnnualReportTeam commented 8 years ago

It's been a while since we opened this issue. We have just started planning the online version of this year's annual report, and having the capacity to import the report could be a deal breaker for us to move to govCMS.

cvharrop commented 8 years ago

Hi AnnualReportTeam - could you reach out to govCMS@finance.gov.au (and cc chris.harrop@acquia.com) please? We'd like to set up a call to discuss.

Wongad commented 8 years ago

Hi @AnnualReportTeam, reading through all the comments and issues mentioned, it seems there's an alternative you could use instead of the HTML Import module.

Three points from your comments that the alternative solution can cater for:

  1. graphic designers are responsible for delivering both PDF and WCAG-compliant HTML
    • Since the annual report is already delivered in HTML format, you just have to zip it and upload it as an archive.
  2. bring in all those little subsites and campaign sites back into the main site
    • The uploaded files are already in the main site, so there is no hassle in bringing them back to the main site.
  3. maintaining the embargo
    • With Shield, the unique directory is already under embargo, protected by the username and password set by the user.

This is what we did when we had a ready-compiled HTML file with customisations to include in our site without creating any further content in govCMS Drupal: https://www.ppsr.gov.au/sites/g/files/net551/f/PPSR-Are-you-in-business.html

Hope this helps.

Just to note, for both the above solution and the HTML Import module, proper clean-up is still required. http://blog.xing.net.au/blogs/how-convert-word-document-and-prepare-it-html-import

AnnualReportTeam commented 8 years ago

@cvharrop I'm going on maternity leave on Friday and have given this to our web team. In addition to the annual reports, we are in the process of making our other corporate publications available in HTML, and we really hope there will soon be a solution that helps us improve our service.

@Wongad thank you for the suggestion. That's what we've been doing for the last few years. Ideally we'd like to have the report pages indexed, searched, classified and managed in Drupal rather than as just static files. Users complained that our annual report content was not searchable on our website and we need to fix it.

fiasco commented 8 years ago

Fixed in #217. Released in 2.0.