Closed utku-ozturk closed 5 months ago
Looks good! For the experiments link, the first column also includes the title, e.g. "multiplexed FISH on ES-E14TG2a with Rad21-AID - 4DNEXRINUWBN". Would it be better to show just the accession "4DNEXRINUWBN", similar to other browser pages?
In re Rahi's comment - the first column is the title that is returned by default as the first column for all searches and I think it is fine to leave as it is - no change needed there.
Utku - I tested the queries you provided above on data and it looks like the Experiment query does return the same number as shown for the count at the top of the home page. If you see a discrepancy, can you provide more info so I can look into it?
For the File search - can you adjust the subquery that you were showing at the meeting this morning to include the 'other_processed_files' linked to experiments and replicate sets? These should be included in the counts in any case. Then see if there is still a difference between the search and aggregated counts. Thanks.
@aschroed Thanks for the feedback. FYI, we adjusted the ES query to include ExpSet and Exp's OPF as below:
"total_expset_other_processed_files" : {
    "cardinality" : {
        "field" : "embedded.other_processed_files.files.accession.raw",
        "precision_threshold" : 10000
    }
},
"total_exp_other_processed_files" : {
    "cardinality" : {
        "field" : "embedded.experiments_in_set.other_processed_files.files.accession.raw",
        "precision_threshold" : 10000
    }
}
Now the total file count on the home page has jumped from 39721 to 46051, which exceeds the /search result count of 40932. We are still investigating alternatives to eliminate any possible duplicates.
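To illustrate how duplicates could inflate the total: each cardinality aggregation counts distinct values within its own field, so adding the per-field results counts a file twice if its accession appears under both fields. The sketch below uses plain Python sets with made-up accessions; it only shows the arithmetic and is not a confirmed explanation of the gap reported here. Cardinality results are also approximate, which can contribute a further difference.

```python
# Hypothetical accessions, invented purely for illustration.
# One set per aggregated field (ExpSet-level OPF and Exp-level OPF).
expset_opf = {"FILE_A", "FILE_B", "FILE_C"}
exp_opf = {"FILE_C", "FILE_D"}  # FILE_C is linked at both levels

# Adding per-field distinct counts double-counts the shared file.
summed = len(expset_opf) + len(exp_opf)

# A distinct count over the union counts each file once.
distinct = len(expset_opf | exp_opf)

print(summed, distinct)  # 5 4
```

The difference between the two numbers equals the number of accessions shared by both fields.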
This looks good to me. Did Will have a look at the queries to ensure the performance is OK or did you change the approach so that is no longer a concern?
@aschroed We initially attempted to split the ES query into three sub-queries (raw, processed, and OPF). But the memory impact would be a big concern, and counts are still approximations. Then, we reverted all of them and added new aggregations to the existing query, which looks sufficient. The first approach definitely would require Will's feedback, but the current approach does not, in my opinion.
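A minimal sketch of the second approach, adding the two OPF aggregations to an existing request body rather than issuing separate sub-queries. The two aggregation definitions are the ones quoted earlier in this thread; the shape of the existing body, the `some_existing_count` aggregation, its field name, and the helper `with_opf_aggs` are hypothetical placeholders, not the project's actual code.

```python
import copy

# Aggregations quoted earlier in this thread.
OPF_AGGS = {
    "total_expset_other_processed_files": {
        "cardinality": {
            "field": "embedded.other_processed_files.files.accession.raw",
            "precision_threshold": 10000,
        }
    },
    "total_exp_other_processed_files": {
        "cardinality": {
            "field": "embedded.experiments_in_set.other_processed_files.files.accession.raw",
            "precision_threshold": 10000,
        }
    },
}


def with_opf_aggs(query_body):
    """Return a copy of query_body with the OPF aggregations merged in."""
    body = copy.deepcopy(query_body)  # leave the caller's dict untouched
    body.setdefault("aggs", {}).update(copy.deepcopy(OPF_AGGS))
    return body


# Hypothetical existing request body; the real one is not shown in this thread.
existing = {
    "size": 0,
    "aggs": {
        "some_existing_count": {"cardinality": {"field": "placeholder.field.raw"}}
    },
}

merged = with_opf_aggs(existing)
print(sorted(merged["aggs"]))
```

All aggregations then run in a single request, which is the property that made this approach preferable to three separate sub-queries.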
Trello: https://trello.com/c/JngcASPz
QuickInfoBar
experiment and file links from /browse to /search
/search/?type=Experiment&experiment_sets.experimentset_type=replicate&experiment_sets.@type=ExperimentSetReplicate
/search/?type=File&experiments.experiment_sets.@type=ExperimentSetReplicate&experiments.experiment_sets.experimentset_type=replicate
/search/?type=File&track_and_facet_info.replicate_info!=No+value
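For reference, search links of this form can be assembled from a parameter mapping with Python's standard library. This is only an illustrative sketch: the helper name `search_url` is hypothetical, and the front end presumably builds these links in its own code. The `!` suffix on a field name (the negated filter) and the `@` in `@type` are kept unescaped via `safe` so the output matches the links listed here.

```python
from urllib.parse import urlencode


def search_url(params):
    """Build a /search/ link; '@' and '!' are left unescaped."""
    return "/search/?" + urlencode(params, safe="@!")


experiment_link = search_url({
    "type": "Experiment",
    "experiment_sets.experimentset_type": "replicate",
    "experiment_sets.@type": "ExperimentSetReplicate",
})

# A trailing '!' on the field name expresses "not equal" in the link.
file_link = search_url({
    "type": "File",
    "track_and_facet_info.replicate_info!": "No value",
})

print(experiment_link)
print(file_link)
```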