Open ebeshero opened 8 years ago
Alright @ebeshero, I pushed both the original text file and the csv file onto github. I tried running it through DB, and while it accepted the csv format, the map itself shows no markers. Help?
Hi @mjb232 Just got home after Honors Convocation...Let me take a look and remind myself how the data import works in CartoDB...
@mjb232 Aha! I just pulled in your CSV, and you've mostly got that right, but some of your coordinates look like this:
Lunigiana,44.2001 N. 10.1754
while most of them look like this:
Genoa,44.4056,8.9463
I see you've got problematic entries all marked like this: Holy Land,0.0,0.0 Gascony,0.0,0.0 (Which means? If they're plotted, they'll all be superimposed on the spot where the prime meridian crosses the equator, I think... I think we could just remove those; what do you think?)
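For anyone who wants to check a CSV like this before uploading, here is a minimal sketch in Python (not part of the project code). It assumes a three-column layout of name, latitude, longitude; the header names and the sample rows are hypothetical, modeled on the entries quoted above. It sets aside rows whose coordinates do not parse as two numbers, and rows sitting at the 0.0,0.0 placeholder.

```python
import csv
import io

# Hypothetical sample modeled on the rows quoted in this thread.
RAW = """name,latitude,longitude
Genoa,44.4056,8.9463
Lunigiana,44.2001 N. 10.1754
Holy Land,0.0,0.0
Gascony,0.0,0.0
"""

def clean_rows(text):
    """Split data rows into (kept, rejected) using simple coordinate checks."""
    kept, rejected = [], []
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    for row in reader:
        try:
            name, lat, lon = row              # needs exactly three fields
            lat, lon = float(lat), float(lon)
        except ValueError:
            rejected.append(row)              # e.g. "44.2001 N. 10.1754" in one field
            continue
        if lat == 0.0 and lon == 0.0:
            rejected.append(row)              # placeholder for an unknown location
        elif not (-90 <= lat <= 90 and -180 <= lon <= 180):
            rejected.append(row)              # outside the valid coordinate range
        else:
            kept.append((name, lat, lon))
    return header, kept, rejected

header, kept, rejected = clean_rows(RAW)
print("kept:", kept)
print("rejected:", rejected)
```

The rejected rows are returned rather than silently dropped, so someone can decide by hand whether to repair them (like Lunigiana) or remove them (like the 0.0,0.0 entries).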
@mjb232 Okay--I spent a few minutes cleaning up the CSV a little and re-orienting myself to CartoDB. It might be easiest if I talk you through this while you're in front of a computer: It's not hard, but you need a quick orientation to how CartoDB works: If you use Skype, I can share desktops--I'm ebbondar on Skype...
@mjb232 In case a voice-meeting won't work, let me try writing it out here: 1) In your CartoDB account, you upload a dataset with your CSV. (You should probably delete the first dataset you uploaded so you can replace it with the new CSV I just pushed in to the project GitHub). To do that, look up at the top right where you see your Profile, and click on the word Datasets. Open up your dataset that you uploaded. Click on Edit (in the upper right of the screen), and click the red "Delete this dataset..." Then, go back and import our new dataset.
2) Now, to get points to plot on your map, CartoDB needs to know how to read them (much like Cytoscape needs to know how to understand "nodes" and "edges" from your imported text file). What you need to do in CartoDB is, open up the new dataset, locate the column that holds your Latitude numbers (N/S coordinates), and the column that holds the Longitude (E/W coordinates). (Be sure you know the difference!) Jot down somewhere handy the name of the column in CartoDB--you'll need it in a second.
3) When you know which columns are which, click on the dropdown field for the appropriate column, and click "Georeference". That will bring up a new screen asking you to Select the column that contains your Longitude (remember, E/W coordinates) and your Latitude (N/S coordinates). Do that and click Continue.
4) You'll notice that CartoDB now assigns a "cartodb_id" and yields a new column of data, "the_geom" which is really their own longitude,latitude CSV pairing of the geocoordinates.
5) Now, go up to the top right and click "Visualize", and on the next screen, "OK, Create map". You'll probably be brought back to your dataset view...but now you are ready to toggle between "Data View" and "Map View"--it's in the top center of your screen. Select "Map View".
6) Voila! You have plotted points! But you now want to play around with styling your map. Get rid of the "Geocoding Dataset window" in the bottom left (if you have it), and you'll see a "Change basemap" option: Play with that and see what the available basemaps are. Choose one you like!
7) On the right of the map screen, mouse over the available options. The paintbrush gives you some options for styling the way the points are shown--and it'll give you some interesting options, to help visualize some nifty things, like clustering--(highlighting which regions on the map are getting the most points, for example). Play around with the options and see what you think is most meaningful.
The cluster map is pretty nifty! But notice that some of the display options (like category) won't make sense because you haven't imported info about these points like we did with the network graphs...Bubble isn't making the node sizes bigger for number of points or anything--it's just making the HIGH LATITUDE numbers have bigger circles--that's kind of not very meaningful, is it? (But if you had a column of output that recorded how often a particular point was referenced in the full text of the Decameron, you might try importing that in a new CSV, and see if you can use the new column to control node size.) Heatmap and Intensity are old favorites of mine! Check out these that I made last summer, and notice what happens when you zoom in and out! http://ebeshero.github.io/thalaba/maps.html
8) Okay, you're going to have to Export and/or Publish your map so you can put it on your website! You can make a couple of different map displays that you like (change the base map, change the display options) and publish each one for your project team.
@RJP43 @spadafour
Here's an output PNG (non-interactive) that I was just messing around with for fun:
Sorry @ebeshero, I've been away from my computer all weekend. Ok, so what do you need me to do now?
@mjb232 Read the notes I posted for you in this comment thread on how to work with CartoDB, and pull the new dataset in. Let me know if you have trouble.
@mjb232 how's it going? If you want me to talk you through any of it, let me know or just ping me here! Most of this is just learning your way around the CartoDB interface. It would be great if you could give @laurenmcguigan some map output (and code to embed a zoomable map in an HTML page) tomorrow so we can figure out where to put it on the site she's working on!
@ebeshero I have a map! Also, I just had the idea to get the location of where the place is mentioned in the text (the day/story) and adding that to the map. That could be useful info and will make our map look cooler.
@mjb232 Great idea, Matt! So you can import that as a column into the map software, I think...and it could be pulled in with the map labels... :+1:
That's what I was thinking. Now the issue is getting the column. I'm in eXide now fiddling with some code. I might need some help with this, so just keep your eyes open in case I need reinforcements.
Will do! I'm working online all day.
@ebeshero Real quick, does eXide work with the ancestor:: axis? I know it doesn't do sibling::.
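@mjb232 BOTH kinds of axes will work in eXide, definitely! Yes, you can use ancestor:: and the sibling axes (written out as following-sibling:: or preceding-sibling::, since there's no plain sibling:: axis), and really any of our friends from the XPath window. What's not working?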
Remember you need the right TEI namespace line at the top of the XQuery script, and you might need to be using the tei: prefix on your elements if you're not getting any output...
@ebeshero that was one of the problems! I also used a : instead of an = at some point. This is what I have so far. Essentially I'm trying to say: for each place name give me the ancestor div[@type="novella"]/head. The only issue is that I'm returning 1001 results, and there should only be 190ish.
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";
declare variable $decamDoc := doc('/db/decameron/engDecameronTEI.xml');
declare variable $novellas := $decamDoc//div[@type="novella"];
declare variable $placeNames := $decamDoc//placeName;
for $places in $placeNames
let $location := distinct-values($places/ancestor::div[@type="novella"]/head/string())
return $location
@mjb232 Ohhh! Here's the problem: For your map, we only output each distinct value of place (because we only want one pin for each place regardless of how many times it comes up in the text). That's not a deal-breaker, but we need to work around it...
@mjb232 So, let's make a plan for this: take distinct-values(), then loop through with a "for loop" and output a bunch of information from the XML tree for each member of the distinct-values list. You could output that list of multiple locations in the XML text using a nice tidy string-join(). Just don't use , as a separator in the string-join() function--it'll just create a whole bunch of new, uneven columns in your CSV.
@mjb232 Just updated my comment above with some more detail/helpful hints.
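To see why the separator matters, here is a small sketch in Python (an illustration only, not the project's XQuery). The place-to-heading data is hypothetical. It builds one delimited line per place twice: once with a comma doing both jobs, and once with a tab between columns and "; " inside the joined list.

```python
# Hypothetical data: each place maps to the headings of the novellas that mention it.
mentions = {
    "Genoa": ["Novella A", "Novella B", "Novella C"],
    "Lunigiana": ["Novella D"],
}

def rows(field_sep, list_sep):
    """Build one delimited line per place, as joined string output would look."""
    return [place + field_sep + list_sep.join(heads)
            for place, heads in mentions.items()]

comma_rows = rows(",", ",")     # comma separates columns AND list members
tab_rows = rows("\t", "; ")     # tab separates columns, "; " separates list members

# Count how many columns an importer would see in each row.
comma_widths = [len(r.split(",")) for r in comma_rows]
tab_widths = [len(r.split("\t")) for r in tab_rows]

print("comma:", comma_widths)
print("tab:  ", tab_widths)
```

With the comma doing both jobs, the rows come out with different column counts; with a tab between columns, every row has exactly two, however many locations are joined inside the second one.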
I don't believe that Carto supports TSV, so where does that leave us?
I think they might...did you give it a try? I was about to make a tiny little file with TSV and see what happens on upload...
I was just looking through their tutorials and I just used Ctrl+F to find TSV, but nothing came up. I searched for CSV and got results though. But I might have missed it, so give that a try.
Most folks don't know about TSV and I think they just don't have documentation posted! My instinct is to test it because I think we can choose what values are "delimiting" the columns on import. I'm in the middle of hammering out a review guide for American Lit, though, so if you'd like to do the test import, I'd be grateful!
I got you! You focus on that haha.
Thanks! But keep pinging me and I'll reply quick. When I need to debug code, I'll do it, but in the meantime, the empirical data testing is yours. :-)
Also, I think Lauren needs help with the HTML, so you should check that out real quick
@mjb232 Just a quick heads-up: We're meeting in class as usual tomorrow (Monday)--the presentation trip is not until Wednesday!
Yeah, I'm not sure why I thought it was tomorrow. Anyways, TSV works! So now I just need to get my list, put it in a text file and upload that to CartoDB as a new column.
Yay!
@ebeshero Ok, so back to my actual code. So, I went to put distinct-values around the places variable, but I get the error that says that (london) can be a node set. What does that mean? Here is the code so far. Also, am I approaching this the correct way?
xquery version "3.0";
declare default element namespace "http://www.tei-c.org/ns/1.0";
declare variable $decamDoc := doc('/db/decameron/engDecameronTEI.xml');
declare variable $novellas := $decamDoc//div[@type="novella"];
declare variable $placeNames := $decamDoc//placeName;
for $places in distinct-values($placeNames)
let $location := $places/ancestor::div[@type="novella"]/head/string()
return $places
@mjb232 Remember, when you take distinct-values() you turn nodes into strings. You just make a string of text and it loses context in the XML tree. The distinct-values list of place names literally cannot have an ancestor::* of any kind because it's not XML any more.
@mjb232 But fortunately, see above, and see past homeworks for what we do. That's why we make a "for loop" and test every member of the list of distinct values to find out about how often and where it shows up in the XML tree...
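The same two-step pattern can be sketched in Python, for anyone who wants to try the logic outside eXist. This is an illustration under stated assumptions, not the project's query: the tiny TEI sample and its headings are hypothetical, and because Python's ElementTree has no ancestor axis, the sketch works from the top down, asking each novella whether it contains the place. Step 1 produces plain strings (which have no ancestors, as noted above); step 2 takes each string back to the tree.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"   # namespace prefix in ElementTree notation

# Hypothetical miniature of the TEI file; the real document is much larger.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
<div type="novella"><head>Novella A</head>
  <p>From <placeName>Genoa</placeName> to <placeName>London</placeName>,
  then back to <placeName>Genoa</placeName>.</p></div>
<div type="novella"><head>Novella B</head>
  <p>A merchant of <placeName>London</placeName>.</p></div>
</body></text></TEI>"""

root = ET.fromstring(SAMPLE)
novellas = [d for d in root.iter(TEI + "div") if d.get("type") == "novella"]

# Step 1: distinct place names. These are plain strings with no position in the tree.
distinct_places = sorted({p.text for p in root.iter(TEI + "placeName")})

# Step 2: for each string, go back to the tree and find the novellas that mention it.
result = {}
for place in distinct_places:
    result[place] = [
        n.find(TEI + "head").text
        for n in novellas
        if any(p.text == place for p in n.iter(TEI + "placeName"))
    ]

# One tab-separated line per place, with the headings joined by "; ".
for place, heads in result.items():
    print(place + "\t" + "; ".join(heads))
```

Each place appears once no matter how often it is mentioned, and each novella heading is listed once per place, which is the shape needed for one pin per place on the map.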
Okay: Now that you have a map, let's kick this up to phase 2 and do this independently from CartoDB! See the new issue posted here: https://github.com/jlm323/DecameronProject/issues/33
@mjb232 @laurenmcguigan @RJP43 @spadafour Here are some leads for mapping that I've been testing:
1) Output a list of distinct places marked in the Decameron Project. You can pull names from either the Italian or English files, or both. The Italian file has some good detail marked in the attributes, defining cities from regions, for example.
2) Run the list through one of the following to "geocode" it--that is, to add latitudes and longitudes from a lookup service. In the past we've tried: http://www.gpsvisualizer.com/geocoder/ which can output a KML data format: KML is a kind of XML that is used for mapping. (Some web mapping services, like Google Maps, import KML. Others take formats like TSV or CSV.)
I have a tutorial/exercise on generating and plotting KML out of a different project team's files here: http://newtfire.org/dh/XQueryExercise3.html (It's a homework assignment we displaced this semester with network analysis.)
OR try to do the lookup in XQuery directly: I have been experimenting with a service called geonames, which lets you set up a free account and run search queries: I managed to reach them inside eXist: http://www.geonames.org/export/web-services.html but I'm having trouble accessing multiple files--I can look at one at a time, but I can't output many of them. I'd like to show you a little about how these geocode services work. I think the GPS Visualizer I linked to in 2) is probably our best bet, but we need to test it...