kbaseapps / kb_uploadmethods

MIT License
4 stars 21 forks source link

Plant taxon #329

Closed qzzhang closed 3 years ago

qzzhang commented 3 years ago

Description of PR purpose/changes

Jira Ticket:

https://kbase-jira.atlassian.net/browse/PUBLIC-1667

sonarcloud[bot] commented 3 years ago

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

No Coverage information No Coverage information
0.0% 0.0% Duplication

qzzhang commented 3 years ago

Tian: Thanks for your comment!

I thought of implementing the changes in GFU...but after I looked into GFU, the RE lookup has been in place. It only needs the client to pass the "taxon_id", like here: https://github.com/kbaseapps/GenomeFileUtil/blob/0b0b4f2ae0377fc111c4fcebaae53591f3919549/lib/GenomeFileUtil/core/GenbankToGenome.py#L240

The problem is on the uploader side the "taxon_id" is optional and the user may skip it hence GFU will skip the RE lookup.

Tianhao-Gu commented 3 years ago

All you did was updating params (assigning a scientific_name if missing) which will eventually be passed to GFU. I don't see why you cannot implement the changes in GFU directly.

The problem is on the uploader side the "taxon_id" is optional and the user may skip it hence GFU will skip the RE lookup. --- I don't get why the problem is on the uploader's side. If 'taxon_id' can be optional for GFU, why the uploader should enforce users now? If your goal was to let set_taxon_data in GFU always run, you should still move get_scientific_name_for_NCBI_taxon to GFU and have the uploader pass params (with ncbi_taxon_id and relation_engine_timestamp_ms) to GFU. The logic should not be implemented in the uploader. The uploader should only pass params to lower-level util (GFU) and having lower-level util perform the logic.

qzzhang commented 3 years ago

I think you've got your point. But I would wait for others to chime in.

Tianhao-Gu commented 3 years ago

Can you also summarize what you are trying to accomplish? In UI, you want to enforce scientific_name. But all your newly added logic does was assigning the scientific_name which will overwrite the UI input. You also removed taxon_id from UI. But you mentioned you want GFU to run set_taxon_data which requires taxon_id. The UI also doesn’t have options for ncbi_taxon_id or the timestamp. So new logic get_scientific_name_for_NCBI_taxon will never be called.

qzzhang commented 3 years ago

First of all, I agree with your idea of not implementing the lookup function inside kb_uploadmethods. I have removed it!

Secondly, the UI changes are needed in kb_uploadmethods such that the 'scientific_name' input will fetch from the dynamic dropdown the data for both params['ncbi_taxon_id'] and parms['relation_engine_timestamp_ms']. I removed the taxon_id UI input because it will be overwritten by the 'ncbi_taxon_id'.

Thirdly, I have added code to assign params['ncbi_taxon_id'] to params['taxon_id'] and parms['relation_engine_timestamp_ms'] to params['time_stamp'] in order to pass them to GFU.

To summarize: What I am trying to do is to ensure the 'scientific_name' is correctly fetched for a given genome input file after the uploading. Previously there are, in the UI, 'scientific_name' and 'taxon_id' both optional and editable textboxes. So if the user types in a scientific_name, the 'scientific_name' will be overwritten by GFU when a 'taxon_id' is given. If a 'taxon_id' is not given, GFU will 'set_default_taxon_data'. Either way, there maybe chance the 'scientific_name' ended up with not something the user expected--that's why we have the jira ticket. So, now I removed the 'taxon_id' input box. Instead, by implementing the dynamic dropdown for the 'scientific_name' input to let the use select an NCBI taxon_id (i.e., params['ncbi_taxon_id']) associated with the typed in 'scientific_name'. This way, we can avoid the occurrence of conflict 'scientific_name'.

qzzhang commented 3 years ago

I am about to push my new changes after the tests are completed.

qzzhang commented 3 years ago

As discussed above, it is better to implement the core changes in GFU instead of kb_uploadmethods, I am closing this PR and re-issue another to not carry the unwanted implementation portions.