programminghistorian / ph-submissions

The repository and website hosting the peer review process for new Programming Historian lessons
http://programminghistorian.github.io/ph-submissions

Review Ticket for Geocoding Historical Data using QGIS #27

Closed acrymble closed 7 years ago

acrymble commented 8 years ago

The Programming Historian has received the following tutorial on 'Geocoding Historical Data using QGIS' by @justincolson. This lesson is now under review and can be read at:

http://programminghistorian.github.io/ph-submissions/lessons/geocoding-qgis

I will act as editor for the review process. My role is to solicit two reviews from the community and to manage the discussions, which should be held here on this forum. I have already read through the lesson and provided feedback, to which the author has responded.

Members of the wider community are also invited to offer constructive feedback, which they should post to this message thread, but they are asked to first read our Reviewer Guidelines (http://programminghistorian.org/reviewer-guidelines) and to adhere to our anti-harassment policy (below). We ask that all reviews stop after the second formal review has been submitted so that the author can focus on any revisions. I will make an announcement on this thread when that has occurred.

I will endeavor to keep the conversation open here on Github. If anyone feels the need to discuss anything privately, you are welcome to email me. You can always turn to @ianmilligan1 if you feel there's a need for an ombudsperson to step in.

Anti-Harassment Policy

This is a statement of the Programming Historian's principles and sets expectations for the tone and style of all correspondence between reviewers, authors, editors, and contributors to our public forums.

The Programming Historian is dedicated to providing an open scholarly environment that offers community participants the freedom to thoroughly scrutinize ideas, to ask questions, make suggestions, or to request clarification, but also provides a harassment-free space for all contributors to the project, regardless of gender, gender identity and expression, sexual orientation, disability, physical appearance, body size, race, age or religion, or technical experience. We do not tolerate harassment or ad hominem attacks of community participants in any form. Participants violating these rules may be expelled from the community at the discretion of the editorial board. If anyone witnesses or feels they have been the victim of the above described activity, please contact our ombudsperson (Ian Milligan - http://programminghistorian.org/project-team). Thank you for helping us to create a safe space.

acrymble commented 8 years ago

This is just a note that we now have 2 reviewers who have agreed to conduct the formal reviews. We await their comments!

adamdennett commented 8 years ago

I really liked this lesson. As a guide to mapping and geocoding using QGIS and LibreOffice, I found it (on the whole) straightforward to follow and useful for anyone who is not familiar with the QGIS software or some issues in turning data into something that a GIS can visualise. A few specific points:

  1. Under Lesson Goals rather than referring to latitude and longitude, I’d be a bit more generic as lat long is only one way of representing coordinates. Suggest change to: “Mapping data such as these involves rendering spatial information which humans can understand (such as names of towns or counties) into a format that can be understood by GIS (mapping) software: some kind of geometry (a point, line or polygon in a vector representation) relating to the coordinates of this element in two dimensional space – these might be latitude and longitude or as is often the case in the UK, eastings and northings of the British National Grid.”
  2. P3 Again, under ‘georeferencing’ – suggest change to: “This involves specifying latitude, longitude coordinates and scale.”
  3. P4 Under ‘geocoding’ – suggest change to ‘geometries on a map’ from ‘points on a map’ as could be geocoding a line or a polygon.
  4. Under ‘joining tables and maps’ – section talking about ranges and colour ramps. Where you say “you may wish to experiment with this and select a different number of classes, or a different method, such as quantiles.” Might be worth mentioning there’s a lot of theory associated with data classification. Consulting here: http://wiki.gis.com/wiki/index.php/Classification is a pretty good place to start. Also, in the newer (2.10 and above) versions of QGIS, clicking on the Histogram > Load Values option is a really nice way of showing how the data are chopped up using the different classification methods.
  5. Under the geocoding with a gazetteer tutorial section. The bit that says “Select Vector>Geometry Tools>Polygon Centroids. Select a new name for the resulting Shapefile such as CountiesCentroids and select add to canvas.” Another option is to go for the Vector > Geometry Tools > Export/Add Geometry Columns – not much difference though…
  6. Point 4 under ‘Geocoding your data table’. The bit that says: “Repeat for the second table and look at each to refresh yourself on the contents of the columns.” - I think here you should be clear that the second table is the centroids table created earlier. It’s also worth pointing out that when you try and copy and paste from calc the columns seem to default to ‘text’ – even when they are integers or doubles so make sure you point out that you should import as the correct data type or it will cause problems later on…
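
To make point 4 concrete, here is a minimal sketch of how two common classification methods split the same skewed data. It is not taken from the lesson or from QGIS itself: the function names and the sample counts are made up for illustration, and QGIS may compute its own class breaks differently.

```python
from bisect import bisect_left
from statistics import quantiles


def equal_interval_breaks(values, n_classes):
    """Upper bounds of n_classes bins of equal width between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]


def quantile_breaks(values, n_classes):
    """Upper bounds of n_classes bins holding roughly equal numbers of values."""
    cuts = quantiles(values, n=n_classes, method="inclusive")
    return cuts + [max(values)]


def class_sizes(values, breaks):
    """Count how many values fall at or below each successive upper bound."""
    sizes = [0] * len(breaks)
    for v in values:
        sizes[bisect_left(breaks, v)] += 1
    return sizes


# Hypothetical counts of students per county (illustrative numbers only).
counts = [1, 2, 2, 3, 4, 5, 8, 13, 40, 120]

equal = equal_interval_breaks(counts, 4)
quant = quantile_breaks(counts, 4)
equal_sizes = class_sizes(counts, equal)
quant_sizes = class_sizes(counts, quant)

print("equal interval breaks:", equal, "class sizes:", equal_sizes)
print("quantile breaks:      ", quant, "class sizes:", quant_sizes)
```

With skewed data like this, equal intervals crowd most counties into the lowest class, while quantiles spread them more evenly across the colour ramp. Neither is "right"; the choice changes what the map appears to say.
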
acrymble commented 8 years ago

Thanks @adamdennett. We're just waiting on a second review before I'll summarise everything and @justincolson can respond. If it takes me a few days to summarise I hope you'll forgive me. It's summer :)

leonrobichaud commented 8 years ago

General comments

As more historians work to integrate GIS in their research process and tackle what can be a steep learning curve, a tutorial dedicated to geocoding historical data is a welcome addition to The Programming Historian. Apart from the technical skills which can be learned from any general GIS geocoding book or tutorial, the conversion of historical data into a format which can be used by GIS software also requires a critical approach to avoid potential pitfalls. This tutorial does a very good job of bringing up various issues related to historical data and answers some of the questions historians have about how to associate historical data with the type of geometry used in GIS. The idea of first showing how to join a table with a map is excellent in that it allows the user to quickly create a map and see results, which provides immediate positive feedback.

This type of tutorial requires many steps and is followed by different types of analysis which can only be briefly covered. It can be difficult to define the limits of the tutorial. Should a QGIS data analysis tutorial be added to The Programming Historian, links to such a tutorial could be added to the current one.

The following comments are geared at clarifying some steps, thereby smoothing out the learning curve. I analysed this tutorial as part of the "Mapping and GIS" series of tutorials. Should the series editors and the author reject this premise, some of the following suggestions can be disregarded.

On the whole, the instructions are fairly clear (the screen captures did not display for some reason) and I followed it using a French version of QGIS and of LibreOffice installed on Kubuntu Linux. At the moment, I would suggest that the tutorial is easy to follow for users who are familiar with QGIS and LibreOffice. If a broader audience is to be reached, some more explicit instructions would be necessary.

Steps that are evident for experienced QGIS users could be explicitly stated for newer users. Although users would normally have followed the "Installing QGIS 2.0 and Adding Layers" tutorial, they may not have internalised some of the concepts or become familiar with all of the steps. Given that some users will follow each step mechanically, names for files and directories to be created should be explicitly proposed and used later in the tutorial, as was done in the Introduction to Python tutorials, which I have used several times in class to introduce history students to programming.

Tips, notes, definitions and other information which are not technically part of the geocoding process are currently presented in various parts of the document. Does the Programming Historian tutorial format allow for an "About Geocoding" section which could group all of this information?

  1. Lesson Goals The text is structured to present what you can do with geocoding. A simple reformulation would present this as goals for the lesson, which would be more consistent with other tutorials. The distinction between georeferencing and geocoding clarifies an issue which can confuse many beginners and could be placed in an "About Geocoding" section if possible.
  2. Lesson Structure The notes which follow the presentation of the structure could be added to an "About Geocoding" section.
  3. Getting Started The lesson states the reader should have already installed QGIS and followed the Installing QGIS 2.0 and Adding Layers tutorial. References to the separate GDAL framework installation on Mac can seem redundant given that this is already specified in the other tutorial. A link to the LibreOffice installation instructions (https://www.libreoffice.org/get-help/install-howto/) would simplify matters for users who need help in this regard.
  4. Part 1: Joining Tables and Maps
  5. In the passage regarding the coordinate system, a link could be made (as in the "Installing QGIS" tutorial) to the "Working with projections tutorial".
  6. The instructions for downloading the historic counties of England and Wales should use the same titles that are used on the Historic Counties Trust page. Some users will hesitate when selecting which file to download.
  7. Even though the Add Vector Layer operation uses default options, a screen capture of the dialog would be useful, in case a user has accidentally clicked on other options. For some users, a reminder to click on the Browse button can be useful when they want to choose the directory and name the file.
  8. The item numbered 1 for the addition of a delimited text layer is a description of the file and not an operation. It could be part of the presentation text instead of a numbered step.
  9. In step 4 of the join operation, instead of referring to "the new table imported from the CSV file", it would be more explicit to refer to a specific table name.
  10. Part 2: Geocoding Historical Data
  11. This part is separated into three sections, an introduction, a tutorial on geocoding with a gazetteer and a section on geocoding your own historical data. Given that there are regions for which historical gazetteers do not yet exist, a link to the "Creating New Vector Layers" tutorial could be added to remind users where to go if they want to embark on that adventure.
  12. Part of the introduction to this section could be moved to an "About Geocoding" section.
    • Geocoding with a Gazetteer.
    • The first set of instructions is integrated in the paragraph instead of appearing as a numbered list. Changing the format to a structured list will make it easier to follow.
    • The instructions on importing the data into LibreOffice Base are grouped as one step. Numbered substeps could be specified.
    • The Troubleshooting Database Gazetteer Joins section will be consulted by users when they want to apply this technique to their own data. It could be moved to the Geocoding your own Historical Data section.
    • The Points in Polygon analysis method is mentioned, but the steps are not shown.
    • The Point Displacement display steps should be presented as a numbered list instead of being grouped in the paragraph. The steps to generate a HeatMap could be explained. If possible, a first set of values that can be tweaked by the user should be provided to create an initial view. If a tutorial on data analysis and presentation is created, a link to such a tutorial would resolve the problem.
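
For readers who want to see the logic behind the gazetteer join and its troubleshooting step, here is a rough sketch using Python's built-in `sqlite3` module as a stand-in for LibreOffice Base. The table names, column names, coordinates and the misspelt "Kente" are all invented for illustration; they are not the lesson's actual data.

```python
import sqlite3

# In-memory database standing in for the LibreOffice Base file (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gazetteer (name TEXT PRIMARY KEY, x REAL, y REAL);
    CREATE TABLE alumni (id INTEGER PRIMARY KEY, place_of_origin TEXT);
""")

# Placeholder coordinates only; these are not real county centroids.
conn.executemany("INSERT INTO gazetteer VALUES (?, ?, ?)", [
    ("Essex", 570000.0, 215000.0),
    ("Kent", 590000.0, 150000.0),
])

# One row per person; "Kente" imitates a variant spelling in a historical source.
conn.executemany("INSERT INTO alumni (place_of_origin) VALUES (?)", [
    ("Essex",), ("Kent",), ("Essex",), ("Kente",),
])

# A LEFT JOIN keeps every alumni row, leaving x and y empty where no match exists.
rows = conn.execute("""
    SELECT a.id, a.place_of_origin, g.x, g.y
    FROM alumni AS a
    LEFT JOIN gazetteer AS g ON g.name = a.place_of_origin
    ORDER BY a.id
""").fetchall()

geocoded = [row for row in rows if row[2] is not None]
unmatched = [row[1] for row in rows if row[2] is None]

print("geocoded rows:", len(geocoded))
print("place names needing attention:", unmatched)
```

Listing the unmatched names first gives a short checklist of spellings to standardise or add to the gazetteer before the points are mapped.
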

In conclusion, the tutorial works well for users familiar with QGIS who want to learn a new skill. Users who have only done previous tutorials and who have no other QGIS (or general GIS) experience will probably hesitate at many points during the tutorial. I hope that the suggested clarifications can enhance their experience and help broaden the base of historians who use GIS.

acrymble commented 8 years ago

Thanks to you both. I'll ask that we now close the public reviews and I will summarise the above points for Justin to respond to. This may take me a few days, but I'll do it as soon as I can.

acrymble commented 8 years ago

Thanks to @adamdennett and @leonrobichaud for these comments. I think they're quite straightforward actually. @adamdennett has offered some suggestions for further grounding the suggestions in the theories and practices of geographers, and @leonrobichaud has given some advice on fleshing out the lesson for people who might not already be familiar with QGIS and mapping, which might make it more useful for use in a classroom setting. In particular, he suggests a brief 'about geocoding' section.

@justincolson I will turn this over to you to respond to the comments as you would in any peer review. Please leave us a message here letting us know what you have and have not done so that we can get a clear view of your revisions.

To ensure this is published without undue delay, I'd ask that you complete your revisions within 4 weeks (30 Sept 2016). If that timeline is going to cause you a problem please email me and we can work something out.

I look forward to seeing your revised tutorial, and thanks again to our reviewers.

acrymble commented 8 years ago

@justincolson can you give us an update?

acrymble commented 8 years ago

@justincolson I haven't heard from you on this, so I'm going to close this submission. You can re-open it if you like but I won't be following up.

justincolson commented 8 years ago

Really sorry for the lack of updates on this Adam, I've been utterly inundated with teaching new modules this term. This should start to calm down dramatically after next week, and this was on my to do list for then. Apologies for not updating you to this effect before now. The comments are very constructive, and I'd be happy to revise the tutorial in light of the suggestions, in due course, if you would still welcome the tutorial? I would hope to be able to do so before the end of November, and certainly no later than Christmas.

justincolson commented 7 years ago

Hi @acrymble , again really sorry for the delay on this. I've now completed a revision of the tutorial in a fork on Github, hopefully successfully incorporating all of the reviewers' comments. I taught another workshop at the IHR today and successfully used the second part of this tutorial - the participants were able to work through without any significant problems.

I'm still getting used to Github - should I create a pull request at this stage?

acrymble commented 7 years ago

Thanks @justincolson . We've made it a bit easier by making you a member of this repository. You can just cut and paste the entire new lesson over top of the original one:

https://github.com/programminghistorian/ph-submissions/blob/gh-pages/lessons/geocoding-qgis.md

Let me know when you've done that and I'll have a read.

justincolson commented 7 years ago

Thanks @acrymble - done now! I never had any luck getting the images to work - is it just a case of the links being broken all the time it's in the submissions site, rather than the live one?

Many thanks also to the reviewers! The comments were very helpful and constructive.

acrymble commented 7 years ago

I've had a chance to go through this. I've done the formatting/copyediting bits, including numbering figures and removing the ordered lists and using unordered lists instead. This was because our styleguide was re-starting each list at 1 every time you had more than one paragraph. Figured it was less confusing to just get rid of the numbers.

More substantially, I've reworked your intro, which I think takes it back half-way between what it was in version 1 and 2. I was concerned that there wasn't enough of a sell in the current intro to keep readers going. Let me know what you think.

Secondly, I've added a couple of tables in Part 1 (Paragraph 17: http://programminghistorian.github.io/ph-submissions/lessons/geocoding-qgis), which I hope might help illustrate that point a bit more clearly. Take a look and make sure I haven't messed anything up.

I've also gone through and cleaned up language and added in a number of extra links where jargon appeared.

Please take a look and make sure you're happy with everything. If so we can move it over to the main site.

As for an image to represent the lesson, I suggest this:

https://www.flickr.com/photos/britishlibrary/11064886903/in/album-72157640584771663/

Has that idea of making links/connections, which I think will work. But if you have another idea, let me know.

leonrobichaud commented 7 years ago

Hello Justin and Adam,

Congratulations on an excellent tutorial for geocoding historical data. I will be recommending it to students and colleagues.

I just wanted to mention that I noticed a typo. In 2 places, AlumniCounties_Count Place of Origin is written AlumiCounties_Count Place of Origin (Sections 24 and 27).

Thanks for a new resource in HGIS.

Léon

acrymble commented 7 years ago

Thanks @leonrobichaud. Fixed.

justincolson commented 7 years ago

Many thanks for the extra tweaks @acrymble - the tables really do make it clearer! I had imagined using a more straightforward pins sticking out of a map type image, but actually I like the logic of the image you suggest. Happy for publication to go forward whenever you're ready!

acrymble commented 7 years ago

@ianmilligan1 or @wcaleb would either of you mind helping me move this lesson and its images: https://github.com/programminghistorian/ph-submissions/tree/gh-pages/images/geocoding-qgis to the main site?

We're ready to publish and everything else should be good to go.

ianmilligan1 commented 7 years ago

Sure! I am just going into a meeting but will do this either today or over the weekend.

ianmilligan1 commented 7 years ago

actually just moved 'em over for you. the images and the lesson md file. let me know if you need anything else. 😄

acrymble commented 7 years ago

Thanks @ianmilligan1

acrymble commented 7 years ago

This lesson has now been published. Thanks @justincolson for your hard work, as well as for the workshop that was the impetus for the lesson.

It's Friday night so probably not the best time to promote it, but in order to maximise the uptake of the lesson, please spend some time next week to let people know about it. We find the most used lessons tend to be the ones that authors themselves refer to in writing and teaching.

Thanks also to @leonrobichaud and @adamdennett for your comments.

http://programminghistorian.org/lessons/geocoding-qgis

justincolson commented 7 years ago

Hi Adam,

I'm afraid I've just had a slight error pointed out to me on some of the wording in one of the sections you added, which I hadn't noticed last night. Just a missing word really:

You can say there are 50 students from the county Essex in your data, and thus link that to your Essex shapefile (Table 1). But you cannot store the data as 50 rows, each of which represents a single student that points to the Essex shapefile (Table 2). One shapefile, one value.

This should probably read:

You can say there are 50 students from the county Essex in your data, and thus link that to the Essex polygon feature in your shapefile (Table 1). But you cannot store the data as 50 rows, each of which represents a single student that points to the Essex feature in your shapefile (Table 2). One shapefile feature, one value.

The point is that we're talking about what is effectively a row within a table (which happens to be a shapefile) rather than the whole shapefile itself.
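
As a rough illustration of that "one feature, one value" point, the per-student rows have to be collapsed into one count per county before they can be joined to the county features. The sketch below uses plain Python dictionaries as stand-ins for the attribute tables; the student names, field names and counties are hypothetical, not the lesson's data.

```python
from collections import Counter

# One row per student (the "Table 2" shape): cannot be joined to a feature directly.
students = [
    {"name": "Student A", "county": "Essex"},
    {"name": "Student B", "county": "Essex"},
    {"name": "Student C", "county": "Kent"},
]

# Collapse to one value per county (the "Table 1" shape).
counts = Counter(student["county"] for student in students)

# Stand-in for the shapefile's attribute table: one row per polygon feature.
features = [{"county": "Essex"}, {"county": "Kent"}, {"county": "Surrey"}]

# Join: each feature receives exactly one value; counties with no students get 0.
joined = [
    {**feature, "student_count": counts.get(feature["county"], 0)}
    for feature in features
]

for row in joined:
    print(row["county"], row["student_count"])
```
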

Sorry I didn't spot this before - always the way!

Best

Justin

On Fri, 27 Jan 2017 at 19:11, Adam Crymble notifications@github.com wrote:

acrymble commented 7 years ago

Thanks @justincolson. Updated.