Closed dlschwartz closed 4 years ago
@dlschwartz I will take a look. It looks like a SPARQL problem to me. Maybe the wrong things are getting counted.
Maybe, but there is also legacy data being visualized. It's complicated so I'll make another issue as I'm not sure how this relates to what I've done in this issue.
@wsalesky Things are looking pretty good. I can't see counts though because the sources aren't showing up:
Also, on the browse page, the persons and places are being moved to the end of the sentence:
However, the prose comes out fine on the factoid page:
@dlschwartz Still a work in progress, I ran into a nasty bug last night so I didn't get everything worked out as I wanted.
@wsalesky Very good. I wasn't sure where things stood but thought I'd offer some feedback. Thanks!
@dlschwartz As an update, I have rerun the data but am still getting bad counts, and I cannot figure out what is causing the malformed labels; they show up fine when I run them one at a time. So, still debugging. Sorry it is taking so long.
Also, I think a future development will be to split the facets out to make them run faster, with each facet being a single SPARQL request. Right now they are all submitted together, which is slow.
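A rough sketch of the one-request-per-facet idea, not the application's actual code. The facet names, the query strings, and the `run_query` callable are all hypothetical placeholders, since the real queries and endpoint are not shown in this thread. Each facet query is submitted separately so that one slow facet does not hold up the rest.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical facet queries; the real SPARQL used on dev is not shown here.
FACET_QUERIES = {
    "person": "SELECT ?v (COUNT(?f) AS ?n) WHERE { ?f <urn:example:person> ?v } GROUP BY ?v",
    "place": "SELECT ?v (COUNT(?f) AS ?n) WHERE { ?f <urn:example:place> ?v } GROUP BY ?v",
    "source": "SELECT ?v (COUNT(?f) AS ?n) WHERE { ?f <urn:example:source> ?v } GROUP BY ?v",
}


def fetch_facets(run_query, queries=FACET_QUERIES, max_workers=4):
    """Send each facet query as its own request and collect results by facet name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {name: pool.submit(run_query, query) for name, query in queries.items()}
        # result() blocks until that facet's request finishes (or re-raises its error).
        return {name: future.result() for name, future in futures.items()}


def fake_run_query(query):
    """Stand-in for an HTTP call to the SPARQL endpoint (assumption for the demo)."""
    return [("example-value", len(query))]


facets = fetch_facets(fake_run_query)
print(sorted(facets))
```

In practice `run_query` would wrap whatever HTTP client talks to the endpoint; the point is only that the facets no longer share a single combined request.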
@wsalesky Not slow at all Winona. I'm worried I'm taking you away from your kids on the last week of their summer break. We'll get this sorted. Thanks Winona!
@dlschwartz I think I now have the correct data. Just need to troubleshoot the SPARQL queries.
@wsalesky Thanks! Things are looking great.
@dlschwartz I think it is fixed. However, the 'Persons' tab is a little odd. The facet counts are based on the number of person factoids, but the results just show the Person, so the counts look off if there are multiple matching factoids about a particular person. Do we want to show the factoids on this page instead of just the person?
Also, tomorrow I will try to speed up the facets. I think making a single request for each facet should do the trick, but it will take a little refactoring. I will do the work locally so I do not break anything on dev.
Let me know if I missed anything!
@wsalesky This is an interesting problem. I'm not sure that the list of individual factoids is very useful to anyone. Dave tends to want this kind of thing though. I tend not to want it.
Options:
1. Leave things the way they are. I don't think it's really a problem to display the results grouped in this way.
2. We could display the count of unique persons about whom we have person factoids instead of the count of factoids? That would be a count of unique values of `//div/listPerson/person/persName/@ref`. Here Dave does have a point about raw data vs. curated data.
3. We could work out some way to display in the browse results the person factoids grouped by the person they are about. If we were to go this way, I'm not sure we should do that right now.
I think that in the short run the options are 1 or 2. Perhaps we just leave the status quo and return to this issue later. Let me know if you have any additional thoughts on this.
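To make the difference between the factoid count and the unique-person count concrete, here is a small sketch using the XPath mentioned above. The sample XML and the example.org URIs are invented for illustration, and it assumes one `persName` per person factoid in the TEI namespace; the real data may be shaped differently.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample: three person factoids, two of them about the same person.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <div><listPerson><person><persName ref="http://example.org/person/1"/></person></listPerson></div>
  <div><listPerson><person><persName ref="http://example.org/person/1"/></person></listPerson></div>
  <div><listPerson><person><persName ref="http://example.org/person/2"/></person></listPerson></div>
</body>"""


def person_counts(xml_text):
    """Return (number of person factoids, number of unique persons by @ref)."""
    root = ET.fromstring(xml_text)
    path = ".//tei:div/tei:listPerson/tei:person/tei:persName[@ref]"
    refs = [name.get("ref") for name in root.findall(path, TEI_NS)]
    return len(refs), len(set(refs))


factoid_count, unique_person_count = person_counts(SAMPLE)
print(factoid_count, unique_person_count)
```

On this sample the facet would read 3 under the current approach and 2 under option 2, which is the mismatch described on the 'Persons' tab.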
@dlschwartz I need to do some more SPARQL experiments. I was aiming for option 2 but got some very odd results. I think step 1 is faster facets; step 2 will be to address this. I will probably hold off until next week unless you feel it is a real problem for your paper.
@wsalesky No problem. Holding off is fine.
@dlschwartz closing this as a stale issue.
@wsalesky There are a number of things that don't seem to be displaying correctly in the counts, or maybe they just aren't showing up in the display at all. I'm not sure how recent this is. I've thought that some things might be missing but I wasn't sure if it was a data problem or a bug. Now that this data is mostly cleaned it seems pretty clear it's either a bug or something I don't understand. I'm worried that this might take some effort to figure out. I think the first thing to check actually is that the re-running of the RDF worked. I'm not exactly sure how to check that.
Counts in Oxygen:

| Collection | divs/factoids | listPersons/person factoids | listEvents/event factoids | listRelations/relation factoids |
| --- | --- | --- | --- | --- |
| Lives | 1074 | 649 | 340 | 85 |
| Letters | 1202 | 712 | 159 | 331 |
| Chronicles | 551 | 355 | 171 | 25 (not nested listEvents) |

Compare with: ![all](https://user-images.githubusercontent.com/4984270/44348221-75342480-a45f-11e8-8e7e-b458f9f69f9e.JPG)
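As a quick sanity check on the Oxygen numbers themselves, the snippet below confirms that the person, event, and relation counts add up to the div totals for each collection. The dictionary keys are my own labels; the figures are the ones listed above. This only says the source counts are internally consistent, not where the display discrepancy comes from.

```python
# Counts as reported from Oxygen; key names are illustrative labels.
COUNTS = {
    "Lives": {"divs": 1074, "person": 649, "event": 340, "relation": 85},
    "Letters": {"divs": 1202, "person": 712, "event": 159, "relation": 331},
    "Chronicles": {"divs": 551, "person": 355, "event": 171, "relation": 25},
}


def check_totals(counts):
    """For each collection, report whether the three factoid types sum to the div total."""
    return {
        name: c["person"] + c["event"] + c["relation"] == c["divs"]
        for name, c in counts.items()
    }


print(check_totals(COUNTS))
```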