Well, after https://ihrdighist.blogs.sas.ac.uk/2018/04/16/22-may-2018-workshop-using-space-syntax-methods-to-explore-the-distribution-of-meeting-places-in-19th-century-historic-maps/, I'd like to see something on Spatial Network Analysis of historical data (with DepthMap).
I think the 4 topics should still be among our top suggestions. I'd also like to add topics such as named entity recognition and information extraction; there are a number of open-source tools and technologies out there, mainly from the NLP world, e.g. https://nlp.stanford.edu/software/CRF-NER.shtml.
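For anyone curious what working with the Stanford tool looks like in practice, here is a minimal sketch calling its CRF classifier through NLTK's wrapper. The jar and classifier files must be downloaded separately from the page above, Java must be on the PATH, and the paths shown are hypothetical:

```python
# Minimal sketch: tagging named entities with a Stanford NER model
# via NLTK's wrapper. The classifier and jar paths are hypothetical
# placeholders for wherever you unpacked the Stanford NER download.
from nltk.tag import StanfordNERTagger
from nltk.tokenize import word_tokenize

tagger = StanfordNERTagger(
    "stanford-ner/classifiers/english.all.3class.distsim.crf.ser.gz",  # hypothetical path
    "stanford-ner/stanford-ner.jar",                                    # hypothetical path
    encoding="utf-8",
)

text = "Ada Lovelace worked with Charles Babbage in London."
tokens = word_tokenize(text)

# tagger.tag returns (token, label) pairs, e.g. ('London', 'LOCATION');
# 'O' marks tokens that are not part of any entity.
for token, label in tagger.tag(tokens):
    if label != "O":
        print(token, label)
```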
Would love to get some sort of audio lesson in the mix - I've reached out to a handful of people in the past year but haven't been able to find someone who has the time to commit just yet. And I agree that the original four topics should also stay in the mix if there is room.
@amsichani, though I think having a tutorial on how to use NER is a great idea, it doesn't yet work well in Spanish. Of course, the fact that it didn't work for my dissertation (using R) doesn't mean the tool is disqualified (it works great in English), but I just wanted to let you know about potential problems for a global audience. I include two examples:
I agree with @jenniferisasi - my experiments with NER (especially with geographical places) were a bit disappointing. I have heard that for Spanish, FreeLing is better than Stanford NER, but I am not sure; I have not tried it. In any case, if we ever get a submission about NER or NLP, please take into account the difficulties pointed out here.
@walshbr can you give me a more specific request on audio that I can include?
@acrymble sure. How about this:
How can we analyze audio artifacts? We have one lesson on how to use Audacity to edit audio files and another on how to transform your data into audio to better understand it. But you can do much more! How are you using tools to get quantifiable data about your audio artifacts? Or how can you use machine learning techniques to produce new understandings of an audio collection? If you've used digital means to analyze audio, we would love to hear from you.
I'm hopeful that a lead I have on an MEI lesson will pan out, but I've been coming up short with machine learning and general audio analysis. Feel free to edit in any way you'd like, and I'm happy to take another stab at it. The short paragraph I wrote above is a bit broad and vague, and I can be more specific if you'd like. But thought it might be worth trying to cast a wide net for this.
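To give would-be authors a concrete sense of the "quantifiable data" mentioned above, here is a minimal sketch using librosa; the file name is hypothetical, and the features shown (duration, a rough tempo estimate, MFCCs) are just one plausible starting point for a machine-learning pipeline:

```python
# Minimal sketch: pulling quantifiable features out of an audio file
# with librosa. The file name below is hypothetical.
import librosa
import numpy as np

# Load the recording (librosa resamples to 22,050 Hz mono by default)
y, sr = librosa.load("oral_history_interview.wav")

# Overall duration in seconds
duration = librosa.get_duration(y=y, sr=sr)

# Rough tempo estimate and beat positions
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)

# Mel-frequency cepstral coefficients, a common feature set for
# machine-learning models that work with audio
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

print("Duration (s):", round(duration, 1))
print("Estimated tempo (BPM):", np.atleast_1d(tempo)[0])
print("MFCC matrix shape:", mfcc.shape)          # (13, number of frames)
print("Mean MFCCs:", np.round(mfcc.mean(axis=1), 2))
```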
@walshbr So something similar to what some colleagues and I did here https://sro.sussex.ac.uk/71250/?
Yep, exactly - that'd be one good approach, @drjwbaker! MIR (music information retrieval) would be great to have in the mix, if you, your colleagues, or anyone else wants to take a swing at it.
Another topic if possible:
Digital scholarly editions are usually modelled as XML documents. Although there are some publication tools such as TAPAS, editors often find that predefined solutions do not fit their rendering needs. Open-source XML databases like eXistDB or BaseX offer great flexibility for storing, indexing and retrieving hierarchical data, and provide XQuery and XSLT as their query and application programming languages.
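eXistDB and BaseX are queried in XQuery, but just to illustrate the kind of hierarchical retrieval involved, here is a rough Python sketch doing the equivalent over a single local TEI file with lxml and XPath; the file name and element choices are hypothetical:

```python
# Minimal sketch: the kind of hierarchical retrieval an XML database
# performs, approximated locally with lxml and XPath over one TEI file.
# The file name and element choices are hypothetical; in eXistDB or
# BaseX the same query would be written in XQuery.
from lxml import etree

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

tree = etree.parse("edition/letter_042.xml")  # hypothetical file

# Pull the title from the TEI header
title = tree.findtext(".//tei:titleStmt/tei:title", namespaces=TEI_NS)

# Retrieve every person name tagged in the transcription
persons = tree.xpath(".//tei:text//tei:persName/text()", namespaces=TEI_NS)

print("Title:", title)
print("Persons mentioned:", sorted(set(persons)))
```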
Ok so a list:
@walshbr If it makes the list, I can prod them.
I'm going to write another "lessons we'd like to see" blog post (original from 2017: https://programminghistorian.org/posts/call-to-action). It's a good time of year to get people thinking about writing.
Please suggest topics for the post. Four of the original five have never actually been written. They are:
The one that was published was "How do you conduct a stylometric analysis (well)?" (https://programminghistorian.org/en/lessons/introduction-to-stylometry-with-python).
We can keep those 4 or we can drop any and all. Please express ideas if you have them. I don't want this to be a long process. I'll aim to publish early next week.