moleculemaker / chemscraper-frontend

MIT License
1 stars 0 forks source link

Changes for the upcoming release include - original structure, updated UI, Similarity sort, Context overlay #34

Closed yashdeep97 closed 8 months ago

yashdeep97 commented 11 months ago

Changes:

  1. Add original structure created from PDF snippet to the result table.
  2. Remove the split view showing the PDF document alongside the results table.
  3. Add Original Context and open in PDF viewer button that opens an overlay.
  4. Add similarity sort for molecules.
  5. Add HCaptcha to prevent unauthorized use of APIs.
  6. Merge marvin.js changes and add similarity sort for substructure search.
  7. Collect user email to notify on job completion.

Approach:

  1. Create a new React component to display the original structure by creating a snippet of the PDF by passing the box coordinates.
  2. Disable the PDF component of the split view
  3. Reuse the PDF overlay component to implement the "View in context" section and overlay button.
  4. Utilize a new endpoint on the backend to implement similarity sorting and the the row order for the sorted molecules based on a given smile string.
  5. Generate new HCaptcha API key for chemscraper and configure it to be used only for staging and production environments. Disable it for local debugging and development using an environment variable.
  6. Merge new marvin.js changes from main and add a sort method that sorts the filtered rows by the similarity to the drawn substructure.
  7. Add a text field to collect user email in the configuration page.

How to test:

  1. Changes deployed to staging
  2. Run a new chemscraper job and observe results table to observe the original structure and the results table.
  3. Verify the that the following features work correctly:

    • Verify that the similarity sort works and also try searching and sorting using a substructure in marvin UI.
    • Verify that the "View in context" overlay works as expected, with the correct page of the PDF being rendered where the molecule is located.
    • Verify that email notification is received when job execution is complete.

    Known Issues:

    Button formatting and orientation not correct for "View in Context" and "Similarity sort" buttons

yashdeep97 commented 11 months ago

Example results: https://chemscraper.frontend.staging.mmli1.ncsa.illinois.edu/results/f73435b8-9d0f-4c15-a7a0-3b1a959d4390

yashdeep97 commented 11 months ago

Hi, Yashdeep! I tested this and it seems to work 👍

I found that when choosing "Export > CDXML > Single page" without choosing a page, you get back a 500 error. We might want to disable the Export button in this case

Thanks Sara for finding this bug! I will fix it before merging this PR.

yashdeep97 commented 10 months ago

I see that you've added logic that will disable the Export button when valid choices aren't made! Thank you, @yashdeep97! 😄

There might still be some problems with the export, but it may be difficult to track down with both the frontend + backend in flux? It looks like the CSV export is the trigger here, but I don't see any obvious reason why that might be happening.

Maybe best to get the new features merged up and then circle back with a fix?

Thanks Sara for reviewing the changes! I will update the PR description ASAP describing the new changes I pushed. Also, did you notice a new issue with exporting the CSV file?

bodom0015 commented 10 months ago

Hi @yashdeep97, sorry I should have provided more detail on this

Trying them one at a time:

Status Type Pages Additional Args Result
CDXML All pages -- File downloads successfully
⚠️ CDXML Single Page = 1 -- Works on staging, but not local - might just be because my local is missing example_PDF?
CSV Full Table -- 500 Error on both staging and local
CSV Current Table View no additional filters applied 500 Error on both staging and local

CSV example request body:

{
  "jobId": "d40952b0-b961-4463-8a58-8d912fd672b5",
  "cdxml": false,
  "cdxml_filter": "all_molecules",
  "cdxml_selected_pages": [],
  "csv": true,
  "csv_filter": "full_table",
  "csv_molecules": [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]
}

CSV response:

Internal Server Error

Server logs:

UnboundLocalError: local variable 'cdxml_file_data' referenced before assignment
INFO:     10.42.112.15:49692 - "POST /chemscraper/export-results HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 426, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/fastapi/applications.py", line 270, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/applications.py", line 124, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/cors.py", line 92, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/cors.py", line 147, in simple_response
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/usr/local/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/usr/local/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 706, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 236, in app
    raw_response = await run_endpoint_function(
  File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 162, in run_endpoint_function
    return await dependant.call(**values)
  File "/code/app/routers/chemscraper.py", line 142, in analyze_documents
    if cdxml_file_data is None:
UnboundLocalError: local variable 'cdxml_file_data' referenced before assignment

Using Swagger UI, we can see the same error when sending the request body above.

Given that it happens with Swagger UI as well, this might be an issue with mmli-backend 🤔

bodom0015 commented 10 months ago

Yes, this was an error on the server side that is present on the main branch... just a couple of minor typos here: https://github.com/moleculemaker/mmli-backend/blob/main/app/routers/files.py#L143-L153

It might have been my fault with a bad merge fix :pray: sorry about that! It is unrelated to both your frontend PR and Samarth's backend one

I have created a small PR to fix this: https://github.com/moleculemaker/mmli-backend/pull/25