complexdatacollective / Server

A tool for storing, analyzing, and exporting Network Canvas interview data.
http://networkcanvas.com/
GNU General Public License v3.0

Update to use revised network-exporters submodule #293

Closed wwqrd closed 3 years ago

wwqrd commented 3 years ago

This updates the export_requests endpoint (ExportManager) to export using the now self-contained network-exporters. Changes were also required in the network-exporters module to support Electron main (which uses CommonJS and different Electron imports).

Settings have been updated to match the updated module. Some sections expand only when other options are selected, e.g. the CSV formats:

[Image: export_settings]

Export progress is provided over IPC to the ExportModal which previously only showed a spinner:

[Image: export_progress]
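As a rough illustration of the progress reporting described above, here is a minimal sketch of forwarding exporter updates to a renderer. The `{ progress, statusText }` payload shape matches the update lines logged later in this thread; everything else is an assumption made for the example: `FakeExporter`, the `update` event name, the `EXPORT/UPDATE` channel, and the `send` callback (which stands in for something like Electron's `webContents.send(channel, payload)`). It is not the code from this PR.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the exporter; the real network-exporters API may differ.
class FakeExporter extends EventEmitter {
  run(sessionCount) {
    for (let i = 1; i <= sessionCount; i += 1) {
      this.emit('update', {
        progress: Math.round((i / sessionCount) * 100),
        statusText: `Exporting ${i} of ${sessionCount} sessions...`,
      });
    }
    this.emit('finished');
  }
}

// Forward every update to the UI. `send` is an assumption: in Electron main it
// could be a thin wrapper around webContents.send(channel, payload).
function forwardProgress(exporter, send, channel = 'EXPORT/UPDATE') {
  exporter.on('update', (status) => send(channel, status));
}

// Usage: collect the messages a renderer would receive.
const received = [];
const exporter = new FakeExporter();
forwardProgress(exporter, (channel, payload) => received.push({ channel, payload }));
exporter.run(4);
```

Keeping the forwarding in one small function means a modal only needs to subscribe to a single channel and render the latest `progress` and `statusText`.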

~Related tests have been updated to pass, but there are outstanding failing tests on master that I haven't tackled.~

Tests on master are only failing with these updated npm packages. Needs to be resolved before merging.

Resolves https://github.com/complexdatacollective/Server/issues/287. Also fixes https://github.com/complexdatacollective/Server/issues/274.

rebeccamadsen commented 3 years ago

I get "invalid export options" when I try to export, or if I unselect the "csv" option, I get "Please select at least one file type to export for CSV". This is with a Development protocol, and both data from NC as well as data from generating test sessions. What am I missing?

wwqrd commented 3 years ago

@rebeccamadsen That should be working now. Have also added a warning for no formats selected.

rebeccamadsen commented 3 years ago

This helps! I can export data successfully. Some things I noticed:

wwqrd commented 3 years ago

> I exported 10 sessions I had, and then sent another session from my installed version of NC. When I exported from Server again, I only got the newest session, not all of them. I wonder if this is a mismatch in generated sessions vs from-NC sessions? Or possibly a mismatch in the protocols? If I then exported a session from my dev NC, all my sessions get exported from Server again. Very odd.

I have an inkling of what's going on here, but not 100% sure.

First of all, the in-app session generation does not work: it generates sessions in the old format. My understanding is that these protocols will not export at all, but I could be mistaken in the following case:

The new exporter uses the NC-generated protocol IDs, so we have to do some wrangling in Server to get those IDs out of the session data itself. The assumption being made is that all the sessions match the protocol ID of the first session (correct me if that's wrong; it looks like the exporters code can support multiple protocols, @jthrilly?). So it may be that, when it's working, it's finding the protocol ID successfully in the NC-generated data, though for me that doesn't explain it exporting only part of the data.

jthrilly commented 3 years ago

> The new exporter uses the NC generated protocol IDs, so we have to do some wrangling in the server to get those IDs out of the session data itself

Does it? I thought it just expected to find any protocol ID in the sessions collection as a key in the protocols collection? That should make it relatively implementation agnostic.
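To make the lookup being discussed concrete, here is a minimal sketch that groups sessions by the protocol ID each one carries and checks that ID against the keys of a protocols map, rather than assuming every session shares the first session's protocol. The field name `protocolId` and the shapes of `sessions` and `protocols` are hypothetical; the real Server and network-exporters data structures may differ.

```javascript
// Group sessions by protocol ID and report any IDs with no matching protocol.
// `protocolId`, `sessions`, and `protocols` are illustrative names only.
function groupSessionsByProtocol(sessions, protocols) {
  const groups = new Map();
  const missing = new Set();
  for (const session of sessions) {
    const id = session.protocolId;
    if (!Object.prototype.hasOwnProperty.call(protocols, id)) {
      missing.add(id); // no protocol is keyed by this ID
      continue;
    }
    if (!groups.has(id)) groups.set(id, []);
    groups.get(id).push(session);
  }
  return { groups, missing: [...missing] };
}

// Usage: two sessions match a known protocol, one does not.
const protocols = { abc: { name: 'Development' } };
const sessions = [
  { id: 's1', protocolId: 'abc' },
  { id: 's2', protocolId: 'abc' },
  { id: 's3', protocolId: 'old-format' },
];
const result = groupSessionsByProtocol(sessions, protocols);
```

Reporting the unmatched IDs would at least make a partial export visible instead of silently dropping sessions.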

> First of all, the in app session generation does not work. It generates sessions in the old format. My understanding is that these protocols will not export at all, but I could be mistaken in the following case:

I think you've got this the wrong way around! NC is generating in the new format, and Server is generating in the old format.

jthrilly commented 3 years ago

I'm going to look at this now, but my general feeling is that there are too many moving parts in terms of session format changes, protocol changes, export format changes, etc.

I'm going to test a really simple case of Server exporting a test session generated within Server in all of the formats supported by the new network-exporters submodule. If it passes that test, I think we merge. Subsequent issues should be dealt with separately as they aren't about this specific implementation.

wwqrd commented 3 years ago

> I think you've got this the wrong way around! NC is generating in the new format, and Server is generating in the old format.

By "in app" I mean in Server! I've pushed an update which fixes it.

rebeccamadsen commented 3 years ago

I could not quite export the CSV for 4500 sessions (Windows? Or do I need a new computer?). I could export the GraphML, but not CSV, not even just the attributes. Is this an error we could catch and provide feedback to the user? It just crashed my Server UI, and I pulled this from the command line:

14:10:55.782 > update { progress: 60, statusText: 'Creating zip archive...' }
events.js:200
      throw er; // Unhandled 'error' event
      ^

Error: EMFILE: too many open files, open 'C:\Users\BIGBAL~1\AppData\Local\Temp\temp-export-e1230a64-7e6a-48b4-b9a8-5f231ccc89c5\ex_recusandae_blanditiis_54320_0f8599d6-dd33-47de-921c-35b54f51ed26_attributeList_person.csv'
Emitted 'error' event on ReadStream instance at:
    at internal/fs/streams.js:120:12
    at FSReqCallback.oncomplete (fs.js:146:23) {
  errno: -4066,
  code: 'EMFILE',
  syscall: 'open',
  path: 'C:\\Users\\BIGBAL~1\\AppData\\Local\\Temp\\temp-export-e1230a64-7e6a-48b4-b9a8-5f231ccc89c5\\ex_recusandae_blanditiis_54320_0f8599d6-dd33-47de-921c-35b54f51ed26_attributeList_person.csv'
}

Above this were a bunch of similar complaints that it couldn't write to the main.log file because too many files were open. This was with 'Exporting 4482 of 4496 sessions...', since I deleted a few trying to find a threshold that wouldn't crash. It didn't include the 14 sessions I had from previous means and stopped at 4482.

This is the log error:

14:47:43.114 > electron-log.transports.file: Can't write to C:\Users\Big Baloo\AppData\Roaming\Network Canvas Server\logs\main.log Error: Couldn't write to C:\Users\Big Baloo\AppData\Roaming\Network Canvas Server\logs\main.log. EMFILE: too many open files, open 'C:\Users\Big Baloo\AppData\Roaming\Network Canvas Server\logs\main.log'
    at File.writeLine (C:\Users\Big Baloo\Documents\GitHub\Server\node_modules\electron-log\src\transports\file\file.js:138:7)
    at transport (C:\Users\Big Baloo\Documents\GitHub\Server\node_modules\electron-log\src\transports\file\index.js:67:10)
    at runTransport (C:\Users\Big Baloo\Documents\GitHub\Server\node_modules\electron-log\src\log.js:44:5)
    at runTransports (C:\Users\Big Baloo\Documents\GitHub\Server\node_modules\electron-log\src\log.js:27:7)
    at log (C:\Users\Big Baloo\Documents\GitHub\Server\node_modules\electron-log\src\log.js:21:3)
    at EventEmitter.<anonymous> (C:\Users\Big Baloo\Documents\GitHub\Server\electron-dev\server\AdminService.js:302:20)
    at EventEmitter.emit (C:\Users\Big Baloo\Documents\GitHub\Server\node_modules\eventemitter3\index.js:181:35)
    at FileExportManager.emit (C:\Users\Big Baloo\Documents\GitHub\Server\electron-dev\utils\network-exporters\src\FileExportManager.js:74:17)
    at C:\Users\Big Baloo\Documents\GitHub\Server\electron-dev\utils\network-exporters\src\FileExportManager.js:204:26
    at Array.map (<anonymous>)

We could ask the user to unify the networks, or to select fewer CSV attributes, but a threshold will still be reached at some point. Exporting a unified network worked for me.

Also, the "cancel" button is unresponsive when exporting this many files. I'm not sure anything can be done about that because the whole app was unresponsive. I was able to cancel when I had significantly fewer files.
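On the question of catching this error: the first trace shows an unhandled 'error' event on a ReadStream, which is what takes the process down. A minimal sketch of one way to avoid that is to attach an 'error' listener and turn the failure into a rejected promise that the caller can report. `readFileViaStream` and `describeExportError` are hypothetical helpers written for this example, not functions from Server or network-exporters, and the message wording is only a suggestion.

```javascript
const fs = require('fs');

// Read a file through a stream, rejecting on failure instead of leaving the
// stream's 'error' event unhandled (an unhandled 'error' event throws).
function readFileViaStream(filePath) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    fs.createReadStream(filePath)
      .on('data', (chunk) => chunks.push(chunk))
      .on('error', reject)
      .on('end', () => resolve(Buffer.concat(chunks)));
  });
}

// Hypothetical helper: map a low-level error to a message for the UI.
function describeExportError(err) {
  if (err.code === 'EMFILE') {
    return 'Too many files are open at once. Try exporting fewer sessions or a unified network.';
  }
  return `Export failed: ${err.message}`;
}

// Usage (sketch): report the failure rather than crashing.
// readFileViaStream(somePath).catch((err) => showError(describeExportError(err)));
```

This only makes the failure visible; it does not raise the number of files that can be open at once.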

jthrilly commented 3 years ago

Maybe using graceful-fs would work? https://stackoverflow.com/a/49673553

wwqrd commented 3 years ago

We already use fs-extra (which uses graceful-fs under the hood), so perhaps that would be worth swapping to?

Is this going to be a problem in NC/devices? I doubt cordova would be able to handle that kind of export, but if it did, it uses different file methods.

jthrilly commented 3 years ago

> Is this going to be a problem in NC/devices? I doubt cordova would be able to handle that kind of export, but if it did, it uses different file methods.

It might be a problem there, but not for the same reason (as you pointed out, the FS stuff is completely different, and we use the most "native" methods we have access to without writing our own cordova plugin). Main thing is that export size will not typically exceed 5 interviews, and will almost certainly never exceed something like 50.

jthrilly commented 3 years ago

I think we should try with fs-extra since it is a drop-in replacement for fs. Should be a find-and-replace job. If that doesn't fix it, let's merge anyway and list it as a known issue.
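If swapping the fs module does not help, another option (not the one proposed in this thread) is to cap how many files are processed at once, so the number of open handles stays bounded regardless of session count. The sketch below is a generic bounded-concurrency map using only built-in JavaScript; `mapWithLimit` is a hypothetical helper, and the timer-based worker stands in for whatever per-file work the exporter actually does.

```javascript
// Run `worker` over `items` with at most `limit` tasks in flight at once.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function runner() {
    // Each runner pulls the next unclaimed index until none remain.
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index], index);
    }
  }
  const runners = Array.from({ length: Math.min(limit, items.length) }, runner);
  await Promise.all(runners);
  return results;
}

// Usage: track how many tasks are active at the same time.
let inFlight = 0;
let maxInFlight = 0;
const done = mapWithLimit([1, 2, 3, 4, 5, 6], 2, async (n) => {
  inFlight += 1;
  maxInFlight = Math.max(maxInFlight, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 5)); // simulated file work
  inFlight -= 1;
  return n * 2;
});
```

With a limit well below the operating system's open-file ceiling, the export would take longer per batch but should not exhaust file descriptors.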