techlib / celus

Celus is a web application for harvesting and visualizing usage statistics of electronic information sources
MIT License

Manual data uploads not working #11

Closed Vinc89 closed 3 years ago

Vinc89 commented 3 years ago

Dear developers, many thanks for providing Celus. I have set up a test installation for our library and they like it!

There is an issue with manual data uploads. When trying to upload a .csv file we get these errors in the log.

[Thu Jul 29 16:20:37.102147 2021]  Internal Server Error: /api/manual-data-upload/4/preflight/
[Thu Jul 29 16:20:37.102170 2021]  Traceback (most recent call last):
[Thu Jul 29 16:20:37.102173 2021]    File "/opt/celus/apps/logs/views.py", line 266, in preflight_check
[Thu Jul 29 16:20:37.102183 2021]      stats = custom_import_preflight_check(mdu)
[Thu Jul 29 16:20:37.102185 2021]    File "/opt/celus/apps/logs/logic/custom_import.py", line 137, in custom_import_preflight_check
[Thu Jul 29 16:20:37.102187 2021]      records = list(mdu.data_to_records())  # type: [CounterRecord]
[Thu Jul 29 16:20:37.102189 2021]    File "/opt/celus/apps/nigiri/counter5.py", line 229, in file_to_records
[Thu Jul 29 16:20:37.102191 2021]      for rec in self._fd_to_records(infile):
[Thu Jul 29 16:20:37.102193 2021]    File "/opt/celus/apps/nigiri/counter5.py", line 244, in _fd_to_records
[Thu Jul 29 16:20:37.102195 2021]      header[header_line[0]] = header_line[1]
[Thu Jul 29 16:20:37.102196 2021]  IndexError: list index out of range
[Thu Jul 29 16:20:37.102198 2021]  
[Thu Jul 29 16:20:37.102200 2021]  During handling of the above exception, another exception occurred:
[Thu Jul 29 16:20:37.102202 2021]  
[Thu Jul 29 16:20:37.102203 2021]  Traceback (most recent call last):
[Thu Jul 29 16:20:37.102205 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
[Thu Jul 29 16:20:37.102207 2021]      response = get_response(request)
[Thu Jul 29 16:20:37.102209 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/django/core/handlers/base.py", line 115, in _get_response
[Thu Jul 29 16:20:37.102211 2021]      response = self.process_exception_by_middleware(e, request)
[Thu Jul 29 16:20:37.102213 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/django/core/handlers/base.py", line 113, in _get_response
[Thu Jul 29 16:20:37.102215 2021]      response = wrapped_callback(request, *callback_args, **callback_kwargs)
[Thu Jul 29 16:20:37.102216 2021]    File "/usr/lib/python3.7/contextlib.py", line 74, in inner
[Thu Jul 29 16:20:37.102218 2021]      return func(*args, **kwds)
[Thu Jul 29 16:20:37.102220 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
[Thu Jul 29 16:20:37.102222 2021]      return view_func(*args, **kwargs)
[Thu Jul 29 16:20:37.102224 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/rest_framework/viewsets.py", line 114, in view
[Thu Jul 29 16:20:37.102226 2021]      return self.dispatch(request, *args, **kwargs)
[Thu Jul 29 16:20:37.102230 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/rest_framework/views.py", line 505, in dispatch
[Thu Jul 29 16:20:37.102232 2021]      response = self.handle_exception(exc)
[Thu Jul 29 16:20:37.102233 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/rest_framework/views.py", line 465, in handle_exception
[Thu Jul 29 16:20:37.102235 2021]      self.raise_uncaught_exception(exc)
[Thu Jul 29 16:20:37.102237 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/rest_framework/views.py", line 476, in raise_uncaught_exception
[Thu Jul 29 16:20:37.102239 2021]      raise exc
[Thu Jul 29 16:20:37.102241 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/rest_framework/views.py", line 502, in dispatch
[Thu Jul 29 16:20:37.102243 2021]      response = handler(request, *args, **kwargs)
[Thu Jul 29 16:20:37.102244 2021]    File "/opt/celus/apps/logs/views.py", line 270, in preflight_check
[Thu Jul 29 16:20:37.102246 2021]      mail_admins('MDU preflight check error', body)
[Thu Jul 29 16:20:37.102248 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/django/core/mail/__init__.py", line 101, in mail_admins
[Thu Jul 29 16:20:37.102250 2021]      mail.send(fail_silently=fail_silently)
[Thu Jul 29 16:20:37.102252 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/django/core/mail/message.py", line 306, in send
[Thu Jul 29 16:20:37.102254 2021]      return self.get_connection(fail_silently).send_messages([self])
[Thu Jul 29 16:20:37.102256 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/django/core/mail/backends/smtp.py", line 103, in send_messages
[Thu Jul 29 16:20:37.102257 2021]      new_conn_created = self.open()
[Thu Jul 29 16:20:37.102259 2021]    File "/opt/virtualenvs/celus/lib/python3.7/site-packages/django/core/mail/backends/smtp.py", line 63, in open
[Thu Jul 29 16:20:37.102261 2021]      self.connection = self.connection_class(self.host, self.port, **connection_params)
[Thu Jul 29 16:20:37.102263 2021]    File "/usr/lib/python3.7/smtplib.py", line 251, in __init__
[Thu Jul 29 16:20:37.102265 2021]      (code, msg) = self.connect(host, port)
[Thu Jul 29 16:20:37.102267 2021]    File "/usr/lib/python3.7/smtplib.py", line 336, in connect
[Thu Jul 29 16:20:37.102268 2021]      self.sock = self._get_socket(host, port, self.timeout)
[Thu Jul 29 16:20:37.102270 2021]    File "/usr/lib/python3.7/smtplib.py", line 307, in _get_socket
[Thu Jul 29 16:20:37.102274 2021]      self.source_address)
[Thu Jul 29 16:20:37.102276 2021]    File "/usr/lib/python3.7/socket.py", line 727, in create_connection
[Thu Jul 29 16:20:37.102278 2021]      raise err
[Thu Jul 29 16:20:37.102280 2021]    File "/usr/lib/python3.7/socket.py", line 716, in create_connection
[Thu Jul 29 16:20:37.102281 2021]      sock.connect(sa)
[Thu Jul 29 16:20:37.102283 2021]  OSError: [Errno 101] Network is unreachable

Our installation is running on Debian 10, but that should not be an issue, right?

Not sure if there is a problem with the uploaded file itself or the network. It would be great if you could point me in the right direction. Let me know if you need more information.

beda42 commented 3 years ago

Hi @Vinc89, thanks for submitting this issue. Could you please attach the file that is giving you trouble? Also, which branch are you using? There were some small fixes in staging related to C5 manual import.

patrickda commented 3 years ago

Hi, just to make the communication a bit more direct: I'm the end user who imported the data; @Vinc89 is our developer. I had problems with nearly all the CSV files I uploaded. In the front end I always see the error: "Error loading preflight data: Error: Request failed with status code 500"

I tried to upload a file which was sent to me from Web of Science, which I converted to CSV with Word on Mac, and a file downloaded from Scopus. Both are attached:
IST Austria - WoS usage Feb 2020 - 35 months backwards - Counter 4.csv
Source_Elsevier_Scopus_dr_d1_2019.csv

beda42 commented 3 years ago

Hi,

the first file uses semicolons ";" rather than "," as separators in CSV, which probably causes trouble.
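To illustrate how a delimiter mismatch can end in an `IndexError` like the one in the log above, here is a hypothetical sketch (not Celus code). A semicolon-separated header line parsed with a comma delimiter comes back as a single field, so reading its second field fails. The helper names `parse_header_line` and `semicolon_to_comma` are made up for this example; the second one shows one possible way to re-save such a file with commas before uploading.

```python
import csv
import io


def parse_header_line(line: str, delimiter: str = ",") -> list:
    """Parse a single CSV line into its fields using the given delimiter."""
    return next(csv.reader([line], delimiter=delimiter))


def semicolon_to_comma(text: str) -> str:
    """Re-write semicolon-separated CSV text as comma-separated CSV text."""
    rows = csv.reader(io.StringIO(text), delimiter=";")
    out = io.StringIO()
    # The writer quotes any value that itself contains a comma.
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


line = "Report_Name;Database Master Report"
fields = parse_header_line(line)  # parsed with "," so only one field comes back
assert len(fields) == 1           # fields[1] would raise IndexError here

converted = semicolon_to_comma(line + "\n")
assert parse_header_line(converted.strip()) == ["Report_Name", "Database Master Report"]
```

Whether the attached file actually failed along this exact path is an assumption; the sketch only shows that the symptom is consistent with a semicolon-separated file being read as comma-separated.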

The second file seems to be formatted correctly, but the problem is that it is for the "DR_D1" report, which is a view of the master "DR" report. Unfortunately, Celus currently only works with master reports for COUNTER 5 and computes the data for the views on the fly from them. Therefore it cannot load the DR_D1 report. If you could get the whole master DR report, it should be possible to upload it.
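Whether a file is a master report or a view can be checked before uploading. The following is a minimal sketch, not how Celus does it: it assumes the file follows the COUNTER 5 tabular layout, in which the header block contains a `Report_ID` row and ends at the first blank line, and it treats `PR`, `DR`, `TR` and `IR` as the master report IDs. The names `read_report_id` and `is_master_report` are hypothetical.

```python
import csv
import io
from typing import Optional

# Master report IDs in COUNTER 5; views carry a suffix such as DR_D1 or TR_J1.
MASTER_REPORT_IDS = {"PR", "DR", "TR", "IR"}


def read_report_id(text: str, delimiter: str = ",") -> Optional[str]:
    """Return the value of the Report_ID header row, or None if absent."""
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not any(cell.strip() for cell in row):
            break  # a blank line ends the header block
        if row[0].strip() == "Report_ID" and len(row) > 1:
            return row[1].strip()
    return None


def is_master_report(text: str, delimiter: str = ",") -> bool:
    """True if the header declares one of the master report IDs."""
    return read_report_id(text, delimiter) in MASTER_REPORT_IDS


sample = "Report_Name,Database Search and Item Usage\nReport_ID,DR_D1\nRelease,5\n"
print(read_report_id(sample), is_master_report(sample))
```

For the sample header the check reports a view (`DR_D1`), which matches the situation described above: the file is well formed, but it is not the master `DR` report.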

patrickda commented 3 years ago

Thanks a lot, I will try tomorrow. As a feature request, some more expressive error messages would be helpful ;-) (I know that is always an issue with resources.)

patrickda commented 3 years ago

Now it works, thanks. That the sub-reports do not work and that CSV is accepted with commas only is a bit confusing. Could semicolons be added to the supported formats? And one more question: how is it with COUNTER 4? Are JR1/JR1a supported? Thanks again.

beda42 commented 3 years ago

I am glad it works now.

Yes, Counter 4 formats should be supported.

As for supporting semicolons: in COUNTER 5 we try to guess the delimiter, but we mostly concentrate on the TSV (Tab Separated Values) and CSV (Comma Separated Values) formats, as the former is required by COUNTER COP and the latter is the most widely used. You can test using semicolons in C5 reports; it might work. In C4 we use an external library for reading the files, which supports TSV and CSV. We do not plan to extend this library, especially now that C4 is obsolete.
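Delimiter guessing of this kind can be sketched with the `csv.Sniffer` class from Python's standard library. This is only an illustration under that assumption; the thread does not say how Celus implements its guess, and `guess_delimiter` is a hypothetical helper.

```python
import csv

CANDIDATES = "\t,;"  # tab, comma, semicolon


def guess_delimiter(sample: str, default: str = ",") -> str:
    """Guess the delimiter of a text sample, falling back to `default`."""
    try:
        return csv.Sniffer().sniff(sample, delimiters=CANDIDATES).delimiter
    except csv.Error:
        return default  # the sniffer could not decide


sample = "Title;Platform;Total\nJournal A;Example;12\nJournal B;Example;7\n"
delimiter = guess_delimiter(sample)
rows = list(csv.reader(sample.splitlines(), delimiter=delimiter))
```

Passing the first few lines of the uploaded file as `sample` is usually enough. The sniffer is a heuristic, so a fallback such as the `default` argument is worth keeping for files it cannot classify.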