simonsobs / librarian

The HERA Librarian.
BSD 2-Clause "Simplified" License
1 stars 2 forks source link

Upload verify file #Issue16 #58

Open JBorrow opened 6 months ago

JBorrow commented 6 months ago

In general, I think we also want to make the checksum-checking happen on the client side. We should ask the remote server for fresh checksums (checked against its own internals), that we can check ourselves. We do not want to send 'our' checksum to the remote server.

JBorrow commented 5 months ago

Right now you are not using the client to interact with a server, you are just mocking its methods. So you are not actually testing the server at all.

We already have a very extensive test suite that has functionality designed to do exactly this. You can see tests/integration_test/test_upload_file.py for example. That way the code in your client function is actually executed, and you would see that it does not work.

On Apr 3, 2024, at 3:57 PM, Mohammad Hasib Sheikh @.***> wrote:

@shaikhhasib commented on this pull request.

In tests/client_unit_test/test_cli.py https://github.com/simonsobs/librarian/pull/58#discussion_r1549916738:

+def test_verify_file_success(capsys):

  • Mock the client and its verify_file_row method

  • client_mock = MagicMock()
  • client_mock.verify_file_row.return_value = {"verified": True}
  • Mock the get_client function to return the mock client

  • with patch("hera_librarian.cli.get_client", return_value=client_mock):
  • Create a mock args object

  • args = MagicMock()
  • args.conn_name = "test_conn"
  • args.name = "test_name"
  • Call the verify_file function

  • cli.verify_file(args)
  • Assert that the verify_file_row method was called with the correct arguments

  • client_mock.verify_file_row.assert_called_once_with(name=args.name)
  • Check the printed output

  • captured = capsys.readouterr()
  • assert "File verification successful." in captured.out I do not see any issue with this!!, can you be more specific?

— Reply to this email directly, view it on GitHub https://github.com/simonsobs/librarian/pull/58#discussion_r1549916738, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3Z6G7F7PWOLOD4KHQNLR3Y3QKEHAVCNFSM6AAAAABEFOZEWGVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTSNZXGA4TMNRWHE. You are receiving this because you were mentioned.

JBorrow commented 5 months ago

Yes, but if you take a look at the definition of Instance (orm/instance.py), you will see that there is a field available that tells you whether or not an instance of a file is actually available. There may be 100 instances associated with a file, but only one marked available. It doesn’t make much sense to me to try to verify file instances that we have already marked as unavailable; those checks would of course fail as we probably have moved the file!

On Apr 3, 2024, at 3:55 PM, Mohammad Hasib Sheikh @.***> wrote:

@shaikhhasib commented on this pull request.

In librarian_server/api/admin.py https://github.com/simonsobs/librarian/pull/58#discussion_r1549913458:

  • for instance in instances:
  • path = instance.path
  • checksum = get_md5_from_path(path)
  • size = get_size_from_path(path)
  • checksums_and_sizes.append(
  • {
  • "store_id": str(instance.store_id),
  • "checksum": checksum,
  • "size": str(size),
  • } What do you mean by Instance? We check the file instance in the above code!!

— Reply to this email directly, view it on GitHub https://github.com/simonsobs/librarian/pull/58#discussion_r1549913458, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3Z6G3HRXR5LUYUTZ67KDDY3QJ43AVCNFSM6AAAAABEFOZEWGVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTSNZXGA4TCMRVHA. You are receiving this because you were mentioned.