georgeamccarthy / protein_search

The neural search engine for proteins.
GNU Affero General Public License v3.0
15 stars 6 forks source link

Dataset #22

Closed fissoreg closed 3 years ago

fissoreg commented 3 years ago

🏆 Enhancements

Purpose

Why?

Feedback required over

References (OPTIONAL)

fissoreg commented 3 years ago

@georgeamccarthy this is kind of ready, it could be useful for the fronted integration. It's a draft cause I would like to add some tests and polish the code a bit.

fissoreg commented 3 years ago

This might fix #21 if we agree on the directories management.

georgeamccarthy commented 3 years ago

Could you please put a link somewhere e.g. in the README of where this dataset is from? I couldn't find its source by following the download link.

fissoreg commented 3 years ago

Sure, we should update the README. For now the link is in #8

fissoreg commented 3 years ago

Tests are still missing but I'm gonna merge this one to work on the frontend. I'll open an issue for the tests.

Rubix982 commented 3 years ago

The PR looks okay. I didn't understand somethings since I'm unfamiliar with jina.

Just one thing, @fissoreg , do you think this PR is too 'large'? It touches 10 files together over 8 commits (excluding merge from main). Maybe @georgeamccarthy can better give an idea of this, since again, I'm unfamiliar with the code base.

The commit messages can be improved, 'Frontend layout' seems unsure, 'Added frontend layouts' seems better. Also no periods at the end of the commit messages.

And another commit, Missing file. was confusing as it only adds 3 lines and I didn't really understand the connection of the commit message with the code written.

PS: Sorry for the 2 reviews below. GitHub is acting buggy.

georgeamccarthy commented 3 years ago

The PR looks okay. I didn't understand somethings since I'm unfamiliar with jina.

Just one thing, @fissoreg , do you think this PR is too 'large'? It touches 10 files together over 8 commits (excluding merge from main). Maybe @georgeamccarthy can better give an idea of this, since again, I'm unfamiliar with the code base.

The commit messages can be improved, 'Frontend layout' seems unsure, 'Added frontend layouts' seems better. Also no periods at the end of the commit messages.

And another commit, Missing file. was confusing as it only adds 3 lines and I didn't really understand the connection of the commit message with the code written.

PS: Sorry for the 2 reviews below. GitHub is acting buggy.

We haven't agreed on a naming convention for commits, PRs and the size of each of these. We could do that moving forward. I agree that the commit messages could be clearer as well as potentially splitting this into two PRs, leaving the frontend to another or change the title to e.g. Add new dataset and basic frontend.