Sage-Bionetworks / schematic

Package for biomedical data model and metadata ingress management
https://schematicpy.readthedocs.io/en/stable/cli_reference.html
MIT License

Disconnect from the server during Synapse submission #377

Closed xindiguo closed 3 years ago

xindiguo commented 3 years ago

Describe the bug: The metadata file passed validation, but the app showed the error message "Disconnect to the server" when I tried to submit it to Synapse.

To Reproduce Steps to reproduce the behavior:

  1. Go to the HIV Data Curator App
  2. Pick the 20201218_Pikachu folder in step 1
  3. Go to the Submit & Validate Metadata tab (please ask Xindi for the metadata file)
  4. Upload, validate, and submit the file, then see the error

Expected behavior: The metadata should assign annotations to the files.

Screenshots: [image: Screen Shot 2021-01-11 at 10 28 47 PM]


Additional context: Here is part of the log from the Shiny server:

Warning: Error in py_call_impl: ParserError: Error tokenizing data. C error: Expected 25 fields in line 32, saw 26

Detailed traceback:
  File "/home/xguo/backend/schematic/schematic/synapse/store.py", line 405, in associateMetadataWithFiles
    manifest = pd.read_csv(metadataManifestPath)
  File "/home/xguo/anaconda3/envs/data_curator_env/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/xguo/anaconda3/envs/data_curator_env/lib/python3.7/site-packages/pandas/io/parsers.py", line 454, in _read
    data = parser.read(nrows)
  File "/home/xguo/anaconda3/envs/data_curator_env/lib/python3.7/site-packages/pandas/io/parsers.py", line 1133, in read
    ret = self._engine.read(nrows)
  File "/home/xguo/anaconda3/envs/data_curator_env/lib/python3.7/site-packages/pandas/io/parsers.py", line 2037, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read
 [... truncated]
  76: <Anonymous>

Execution halted

xindiguo commented 3 years ago

Updates: the submission was successful when tested with the first 9 rows of the file.

xdoan commented 3 years ago

It looks like we're getting a similar error, and it seems to be a common error in how pandas/Python parses CSVs: it expects every row to follow the first row's pattern. Adding @milen-sage because it seems relevant to the back end.
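For anyone debugging a similar manifest: the `ParserError` in the log ("Expected 25 fields in line 32, saw 26") means one row has more fields than the header. The following is a minimal sketch, using only Python's standard `csv` module, of how the offending rows could be located before the file reaches `pd.read_csv`. The sample manifest text and the helper name `find_ragged_rows` are hypothetical and are not part of schematic.

```python
import csv
import io

# Hypothetical manifest: the header has 3 columns, but the last data row has 4.
RAGGED_CSV = (
    "Filename,Component,entityId\n"
    "a.txt,Demo,syn1\n"
    "b.txt,Demo,syn2,extra\n"
)


def find_ragged_rows(text):
    """Return (line_number, field_count) for rows whose width differs from the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    return [
        (reader.line_num, len(row))
        for row in reader
        if len(row) != len(header)
    ]


if __name__ == "__main__":
    for line_number, field_count in find_ragged_rows(RAGGED_CSV):
        print(f"line {line_number}: found {field_count} fields")
```

A common cause is an unquoted comma inside a cell value, which would also be consistent with a submission succeeding when only the first few rows are used.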

sujaypatil96 commented 3 years ago

This issue was resolved in this PR.

xindiguo commented 3 years ago

The collaborator has run into this error again using Safari Version 14.0.2. I will post the log later.

milen-sage commented 3 years ago

@xindiguo yes, please post the log snippet when you have a chance (could be a different issue than the above).

There is also another PR that @sujaypatil96 merged in 'main' about 2 hours ago; that might help, but would be nice to see the logs in any case.

xindiguo commented 3 years ago

The user was not certified.

milen-sage commented 3 years ago

@ychae is there a way to programmatically find if a user is certified (e.g. is there an error we can catch from the synapse client)?
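One possible shape for such a check, as a sketch only: recent versions of the `synapseclient` Python package appear to expose a `Synapse.is_certified(user)` method that returns a boolean, but that is an assumption to verify against the installed version. The wrapper name `require_certified` is hypothetical and is not part of schematic or the Data Curator App.

```python
def require_certified(client, user):
    """Raise PermissionError before submission if the user is not certified.

    `client` is assumed to be a logged-in Synapse client object that provides
    an `is_certified(user)` method returning True or False (an assumption to
    check against the installed synapseclient version).
    """
    if not client.is_certified(user):
        raise PermissionError(
            f"Synapse user {user!r} is not certified and cannot submit metadata."
        )
```

Failing early like this would let the app show a specific message instead of a generic server disconnect.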

milen-sage commented 3 years ago

Ah, just saw that @xdoan has opened an issue on that: https://github.com/Sage-Bionetworks/data_curator/issues/106. We can track it there.