providedh / collaborative-platform

Collaboration made easy
GNU Affero General Public License v3.0

Annotator ids are assumed to have a specific format. #165

Closed Janchorizo closed 4 years ago

Janchorizo commented 4 years ago

The following error occurs when the server tries to parse an integer from an annotator name that does not follow the annotator-<id> format.

I don't think this is the expected behavior. It does not feel right that annotators using a different naming scheme prevent that page from being used (the file won't even show in the files manager). Maybe handling those ids during upload would work, or at least checking any assumptions made about the file format and preventing the upload from happening.

Error:

web_1            | Traceback (most recent call last):
web_1            |   File "/code/src/collaborative_platform/apps/dataset_stats/views.py", line 36, in stats
web_1            |     files = helpers.files_for_project_version(project_id, project_version)
web_1            |   File "/code/src/collaborative_platform/apps/dataset_stats/helpers.py", line 86, in files_for_project_version
web_1            |     file_contents = tuple((get_content(fv), fv.file.name) for fv in project_version.file_versions.all())
web_1            |   File "/code/src/collaborative_platform/apps/dataset_stats/helpers.py", line 86, in <genexpr>
web_1            |     file_contents = tuple((get_content(fv), fv.file.name) for fv in project_version.file_versions.all())
web_1            |   File "/code/src/collaborative_platform/apps/dataset_stats/helpers.py", line 84, in <lambda>
web_1            |     get_content = lambda fv: clean_xml(fv.get_rendered_content())
web_1            |   File "/code/src/collaborative_platform/apps/files_management/models.py", line 228, in get_rendered_content
web_1            |     content = file_renderer.render_file_version(self)
web_1            |   File "/code/src/collaborative_platform/apps/files_management/file_conversions/file_renderer.py", line 32, in render_file_version
web_1            |     self.__append_annotators()
web_1            |   File "/code/src/collaborative_platform/apps/files_management/file_conversions/file_renderer.py", line 325, in __append_annotators
web_1            |     users_ids = self.__get_annotators_ids_of_xml_elements()
web_1            |   File "/code/src/collaborative_platform/apps/files_management/file_conversions/file_renderer.py", line 341, in __get_annotators_ids_of_xml_elements
web_1            |     annotators_ids = [int(annotator.replace('#annotator-', '')) for annotator in annotators]
web_1            |   File "/code/src/collaborative_platform/apps/files_management/file_conversions/file_renderer.py", line 341, in <listcomp>
web_1            |     annotators_ids = [int(annotator.replace('#annotator-', '')) for annotator in annotators]
web_1            | ValueError: invalid literal for int() with base 10: '#recipe_levenshtein_matching'
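
For illustration, the failing expression from line 341 can be reproduced in isolation, together with a more tolerant alternative. This is only a sketch, not the project's code: the helper name `split_annotator_refs` and the regular expression are assumptions introduced here, and the actual fix in the repository may look quite different.

```python
import re

# Hypothetical pattern for the "#annotator-<id>" form the renderer expects.
ANNOTATOR_REF = re.compile(r"#annotator-(\d+)")


def split_annotator_refs(refs):
    """Split refs into integer ids and refs that do not match the pattern."""
    ids, unrecognized = [], []
    for ref in refs:
        match = ANNOTATOR_REF.fullmatch(ref)
        if match:
            ids.append(int(match.group(1)))
        else:
            # Keep the raw value so the caller can report or skip it.
            unrecognized.append(ref)
    return ids, unrecognized


# The expression from the traceback raises ValueError for this value.
try:
    int("#recipe_levenshtein_matching".replace("#annotator-", ""))
    original_raises = False
except ValueError:
    original_raises = True

ids, unrecognized = split_annotator_refs(
    ["#annotator-3", "#recipe_levenshtein_matching"]
)
```

Collecting the unrecognized references instead of raising lets the caller decide whether to skip them, log them, or reject the file.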

The XML content that causes this is the following:

    <profileDesc>
      <particDesc>
        <listPerson type="PROVIDEDH Annotators">
          <person xml:id="recipe_levenshtein_matching">
            <persName>
              <forename>Algorithm</forename>
              <surname>Levenshtein distance match</surname>
              <email/>
            </persName>
          </person>
        </listPerson>
      </particDesc>
      <textClass>
        <classCode scheme="http://providedh.eu/uncertainty/ns/1.0"> <certainty 
             ana="https://providedh-test.ehum.psnc.pl/api/projects/17/taxonomy/#incompleteness" 
             locus="value" 
             degree="0.8571428571428571"
             cert="high"
             resp="#recipe_levenshtein_matching" 
             match="@ref"
             target="#name-0"
             xml:id="cert-0" 
             assertedValue="#sand"/>

bug-rancher commented 4 years ago

The backend does not require a specific format for an annotator's xml:id; it automatically converts all xml:ids to the appropriate format during file upload. The problem was elsewhere: in rendering the first, non-converted version of the file. Already fixed.
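
A conversion of the kind described above could look roughly like the following. This is a minimal sketch under several assumptions, not the platform's implementation: the function name `normalize_annotator_ids` is hypothetical, the sequential numbering stands in for whatever ids the backend really assigns, and the elements are assumed to be un-namespaced, as in the excerpt from the issue.

```python
import xml.etree.ElementTree as ET

# Fully qualified name ElementTree uses for the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def normalize_annotator_ids(xml_text):
    """Rewrite person xml:ids to annotator-<n> and update resp references."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for person in root.iter("person"):
        old_id = person.get(XML_ID)
        if old_id is None:
            continue
        # Assumption: ids are simply numbered in document order.
        new_id = f"annotator-{len(mapping) + 1}"
        mapping[old_id] = new_id
        person.set(XML_ID, new_id)
    for element in root.iter():
        resp = element.get("resp", "")
        if resp.startswith("#") and resp[1:] in mapping:
            element.set("resp", "#" + mapping[resp[1:]])
    return root, mapping


SAMPLE = """<profileDesc>
  <listPerson type="PROVIDEDH Annotators">
    <person xml:id="recipe_levenshtein_matching"/>
  </listPerson>
  <certainty resp="#recipe_levenshtein_matching" target="#name-0"/>
</profileDesc>"""

root, mapping = normalize_annotator_ids(SAMPLE)
```

After such a pass, every resp value has the #annotator-<id> shape that the renderer's integer parsing expects, which is why rendering a version that has not been converted can still fail.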