Dapscoptyltd / QGIS

All things related to QGIS in here.

Technical Log: Extract Pdf and rectify for readability #7

Open Dapscoptyltd opened 7 years ago

Dapscoptyltd commented 7 years ago

Primary research using digitised sources can present problems like the one described in this file. The file source was: “RCDIG1014090”: AWM4 Australian Imperial Force unit war diaries, 1914-18 War; Artillery. Item Number: 13/32.19; Title: Headquarters, 4th Australian Field Artillery Brigade; October 1917.

In the file “RCDIG1014546 Pages extract.pdf” we find a typical problem I discovered and have had to deal with. The original page was inserted backwards into the file, and when the file was digitised, it was not checked.

The adjustments I made here were done with the Adobe Creative Suite, Student edition. It’s fearsomely expensive, and there is other graphics software available (including open source) that can do this.

• It was upside down and backwards:

Figure 1: The original page in the AWM war diary file. I extracted it from the diary file as a single-page PDF.

  1. Open the Page Thumbnails pane. View > Show/Hide > Navigation Panes > Page Thumbnails.

screen shot 2017-05-07 at 11 12 53

Figure 2: Display the page thumbnails pane.

  1. Right click on the page you need to extract.
  2. Select “Extract Pages”, as shown in Figure 3.
  3. Select the appropriate settings.
  4. Save the new file.

Figure 3: Extracting is a simple right-click process (left) in Adobe Acrobat Professional, followed by the Extract Pages prompts (right).

Corrections in Photoshop

  1. I opened Photoshop.
  2. I opened the file “RCDIG1014546 Pages extract.pdf”.
  3. I rotated the image 90° anti-clockwise: Image > Image Rotation > 90° Counter Clockwise.
  4. I then flipped the image horizontally: Image > Image Rotation > Flip Canvas Horizontal.
  5. I saved my changes as “Pages from RCDIG1014546 corrected.pdf”. See Figure 4 below.
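For readers without Photoshop, the rotation and flip steps above can be sketched in plain Python on a tiny grid of numbers standing in for pixels. This is only an illustration of what the two operations do to pixel positions, not how Photoshop implements them; the function names and the sample grid are hypothetical.

```python
def rotate_ccw(grid):
    """Rotate a grid of pixel rows 90 degrees anti-clockwise."""
    # Transposing turns columns into rows; reversing the row order then
    # puts the original right-hand column at the top.
    return [list(row) for row in zip(*grid)][::-1]


def flip_horizontal(grid):
    """Mirror each row left to right."""
    return [row[::-1] for row in grid]


def correct_page(grid):
    """Apply the two corrections in the order used in the steps above."""
    return flip_horizontal(rotate_ccw(grid))


# A hypothetical 2 x 3 "page": each number stands in for one pixel.
page = [
    [1, 2, 3],
    [4, 5, 6],
]

corrected = correct_page(page)
for row in corrected:
    print(row)
```

Because a rotation followed by a mirror is itself a mirror, calling `correct_page` a second time on the result returns the original grid.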

Figure 4: This is the result of rotating and flipping horizontally.

I then had to darken the text.

• I initially tried to select and re-colour the text using the Magic Wand tool and a colour change. This was an inappropriate fix.
• I spent less than two minutes doing that, enough to demonstrate to myself that it was not workable. I then went back to basics.

  1. I opened the Channels window. This allowed me to examine the contrast present in each of the red, green and blue channels of the file. A full discussion of channels is complex and out of scope for this example, except to say that I was looking for the SIMPLEST way to be able to read the file’s text!
  2. I tried a number of colour combinations by clicking the RGB channels on and off. This allowed me to conclude that brightness and contrast changes would assist readability.


Figure 5: Selecting RGB channels helped diagnose that brightness and contrast changes would improve the text readability.
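Clicking channels on and off is a judgement made by eye. A rough numeric analogue, sketched below in plain Python, is to measure how far apart the lightest and darkest values are in each channel. The pixel values here are invented for illustration and were not measured from the war diary page; `channel_spread` is a hypothetical helper name.

```python
def channel_spread(pixels):
    """Return max - min for each of the red, green and blue channels."""
    spread = {}
    for index, name in enumerate(("red", "green", "blue")):
        values = [pixel[index] for pixel in pixels]
        spread[name] = max(values) - min(values)
    return spread


# Hypothetical samples: two "paper" pixels and two "faded ink" pixels.
samples = [
    (230, 220, 190),  # paper
    (225, 215, 185),  # paper
    (170, 150, 100),  # ink
    (175, 155, 105),  # ink
]

spread = channel_spread(samples)
best = max(spread, key=spread.get)  # channel with the widest spread
print(spread, best)
```

A channel with a wider spread separates ink from paper more clearly, which is the kind of difference the on/off comparison is meant to reveal.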

  1. I selected Layer > New Adjustment Layer > Brightness/Contrast.
     • There are myriad reasons for this. The primary one is that this method overlays a brightness and contrast layer which you can turn off or on.
     • Adjustment layers do not alter the original page (unlike the rotations and flips).
     • For the purposes of this technical task, I can turn the layer off during preparation of this file.

Figure 6: Top left: The Channels window or dialog. Note the Brightness/Contrast layer is turned off, as compared to the others with the ‘eye’ icon. Bottom left: The dialog for adjusting the Brightness and Contrast, with the final settings. Bottom right: The Layers window. This is one of the most important tools in Photoshop and, once understood, is a powerful way to wrangle Photoshop.

  1. I adjusted the Brightness and Contrast settings as shown in Figure 6 (bottom left). This produced the results for text readability as shown in Figure 6 (top right).
  2. I saved the file as “Pages from RCDIG1014546 corrected adjusted.pdf”.
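To show what a brightness and contrast change does to the numbers behind the image, here is a minimal sketch in plain Python. It assumes a simple linear formula (scale each value's distance from mid-grey, then add an offset), which is a common textbook approach and not necessarily what Photoshop does internally. The settings and pixel values are hypothetical and are not the ones shown in Figure 6.

```python
def adjust(value, brightness=0, contrast=1.0):
    """Adjust one 0-255 channel value.

    contrast scales the distance from mid-grey (128); brightness is then
    added as a plain offset. The result is clamped to the 0-255 range.
    """
    result = (value - 128) * contrast + 128 + brightness
    return max(0, min(255, round(result)))


def adjust_image(grid, brightness=0, contrast=1.0):
    """Return a new grid; the input grid is left unchanged."""
    return [[adjust(v, brightness, contrast) for v in row] for row in grid]


# Hypothetical greyscale values: pale paper (200) and faint ink (150).
original = [[200, 150], [150, 200]]
adjusted = adjust_image(original, brightness=-20, contrast=1.5)
print(adjusted)
```

With these assumed settings the gap between paper and ink widens from 50 to 75, which is the effect that makes faint text easier to read. Returning a new grid rather than overwriting `original` loosely mirrors the way an adjustment layer leaves the underlying page untouched.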