chenejac / VIVOTestMigrationJIRA

VIVO-1068: Remove features that do not work #1055

Closed chenejac closed 6 years ago

chenejac commented 9 years ago

Mike Conlon (Migrated from VIVO-1068) said:

Identify features that do not work as expected and remove links from templates that access these features. No need to remove the supporting code.

chenejac commented 9 years ago

Patrick West said:

VIVO 1.6 Map of Science does not work: https://info.deepcarbon.net/vivo/vis/map-of-science/da5b50ce3-2877-4e9f-9ba4-f645d811bf43. Hopefully this is fixed in 1.8.

chenejac commented 9 years ago

Graham Triggs said:

Hi Patrick,

Map of Science was actually one of the cases that was discussed on Skype, as it's one of those interesting cases.

Technically, it does function (at least in 1.7 / 1.8) - however, it takes a very long time to process. And by very long time, I mean that for a significant dataset, you could request a Map of Science, and have to wait a couple of days for it to generate the cache!

I've not seen your specific example - where the browser times out. That may be something about the older versions (the current versions will respond with a page asking you to come back later).

But it's still questionable as to whether we can say that this "works" - because it's more than just a case of saying it doesn't generate an error and eventually completes. If something doesn't provide a useful function - which may well have a time aspect (e.g. it must complete in a time that a user is prepared to wait) - then for all practical purposes it doesn't work.

Thanks for the comment - this is exactly the type of feedback we are looking for.

chenejac commented 8 years ago

Jon Corson-Rikert said:

On the roadmap call 10/13, discussed removing or turning off:

  1. CSV ingest
  2. configure Map of Science to be turned off by default but optionally turned on

Other things? OpenSocial container? VIVO is about 2 versions behind what Profiles uses, so some gadgets would not work against APIs that may no longer work. The fact that nobody uses it yet is part of the case for removing it for now and bringing it back when we have some working gadgets that demonstrate useful functionality.

chenejac commented 8 years ago

Mike Conlon said:

My notes listed four things to be investigated and removed:

  1. Open Social (could make a comeback in a later release with a version upgrade, Selenium tests, and delivered functional widgets)
  2. Map of Science (remove from default, provide instructions for adding it back in with caveats/disclaimers)
  3. CSV upload (ancient attempt to provide upload for position data. A one-off.)
  4. Person Harvester -- check with Jon F regarding state of Harvester, state of ontology. If not ready, remove and return in a subsequent release when all attendant parts are assembled and tested.

chenejac commented 8 years ago

Graham Triggs said:

It's still a work in progress, needing a bit of a cleanup, but providing the work gets completed it's looking like Map of Science can be kept enabled by default:

For the UNAVCO dataset - 450 people, 4,000 articles, 250 journals - the following progress had been made:

1.8 initial time to generate initial visualization: 1 min 20 sec.

After replacing existing caching model with SPARQL Select queries for precisely the data that is required: 26 seconds

Not using the RDFServiceDatasetGraph, and issuing the query against RDFService instead: 10 seconds

Removing old createQueryExecution code from RDFServiceSDB so that it executes on the dataset and not the default model: 1.4 seconds.

For a larger dataset - 145,000 people, 155,000 publications and 14,000 journals - the execution time is now 4 minutes 20 seconds.

I've added a new caching layer - currently disconnected from the "rebuild visualization" control - which prevents multiple concurrent attempts to rebuild the cache, dynamically determines the time to live on the cache based on query execution time (e.g. 30 seconds = 1 hour cache, max TTL 1 day), and, for queries of less than 1 second, delivers the results "live".

Additionally, the cached data models are lightweight Maps / Strings / ints, not heavy Jena models (less memory usage), and any cache refreshes happen in the background - once the cache is initially created, page requests will always be delivered from the current state of the cache, so every Map of Science request will be < 1 second, even if it rebuilds the cache in the background for future requests.
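To make the caching behaviour described above easier to picture, here is a minimal Java sketch. It is not the actual VIVO code: the class name `AdaptiveCache`, the method `ttlFor`, and the linear scaling factor of 120 (chosen only because it maps the quoted 30 seconds to 1 hour) are all assumptions made for illustration. It shows a time to live derived from query time and capped at 1 day, "live" delivery for sub-second queries, a guard against concurrent rebuilds, and background refreshes that leave page requests served from the current cache state.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical sketch of an adaptive-TTL cache; names are not taken from VIVO. */
final class AdaptiveCache<T> {
    static final Duration LIVE_THRESHOLD = Duration.ofSeconds(1);
    static final Duration MAX_TTL = Duration.ofDays(1);
    // Assumption: TTL scales linearly with query time (30 s * 120 = 1 h).
    static final long TTL_FACTOR = 120;

    /** Maps a measured query time to a cache lifetime; ZERO means "serve live". */
    static Duration ttlFor(Duration queryTime) {
        if (queryTime.compareTo(LIVE_THRESHOLD) < 0) {
            return Duration.ZERO;
        }
        Duration scaled = queryTime.multipliedBy(TTL_FACTOR);
        return scaled.compareTo(MAX_TTL) > 0 ? MAX_TTL : scaled;
    }

    private final Supplier<T> query;
    private final AtomicBoolean rebuilding = new AtomicBoolean(false);
    private volatile T value;
    private volatile Duration ttl = Duration.ZERO;
    private volatile long expiresAtNanos;
    private volatile boolean populated;

    AdaptiveCache(Supplier<T> query) {
        this.query = query;
    }

    T get() {
        if (!populated) {
            // First request: build the cache once, blocking other first-time callers.
            synchronized (this) {
                if (!populated) {
                    refresh();
                }
            }
            return value;
        }
        if (ttl.isZero()) {
            // The last run took under a second, so run the query for each request.
            refresh();
            return value;
        }
        boolean expired = System.nanoTime() - expiresAtNanos >= 0;
        if (expired && rebuilding.compareAndSet(false, true)) {
            // Only one background rebuild at a time; callers are not blocked.
            CompletableFuture.runAsync(() -> {
                try {
                    refresh();
                } finally {
                    rebuilding.set(false);
                }
            });
        }
        return value; // always the current state of the cache
    }

    private void refresh() {
        long start = System.nanoTime();
        T fresh = query.get();
        Duration took = Duration.ofNanos(System.nanoTime() - start);
        value = fresh;
        ttl = ttlFor(took);
        expiresAtNanos = System.nanoTime() + ttl.toNanos();
        populated = true;
    }
}
```

With these assumed constants, a 30 second query yields a 1 hour TTL, a 20 minute query is capped at 1 day, and a 500 ms query is never cached. A real implementation would also need to decide how a manual "rebuild visualization" request interacts with the guard flag.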

chenejac commented 8 years ago

Mike Conlon said:

Moving the speed-up discussion to a new issue regarding Map of Science.

chenejac commented 8 years ago

Mike Conlon said:

SPARQL Query Builder is another candidate for "does not work". This is very old. Likely not maintained.