PATRIC3 / patric3_website

Legacy PATRIC Website (JBoss Portal Version)
MIT License
Downloaded feature tables not in gene order #490

Open ARWattam opened 9 years ago

ARWattam commented 9 years ago

It looks like the tables are downloaded sorted by peg.id, or something like that. For genomes that do not have a RefSeq or an old VBI locus tag, it is impossible to sort them into the correct order. Can't we please, PLEASE either provide a column that lists the gene order, or not use peg.ids that end in peg.1, etc.? Here's how it looks: [screenshot, 2015-10-01]

mshukla1 commented 9 years ago

Harry,

hyoo commented 7 years ago

This is mainly because of the following: in order to use streaming inside the data API, the query needs to be sorted by the unique ID (feature_id in this case), and the feature ID is a string, so its sort order does not reflect the natural (numeric) order.

I cannot think of any workaround now. @dmachi do you have any idea?
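To illustrate the string-ordering problem described above, here is a minimal sketch. The IDs are made up for illustration (they only mimic the "ends in peg.1" pattern mentioned in this thread), and `natural_key` is a hypothetical helper, not something from the data API.

```python
import re

def natural_key(feature_id):
    """Split an ID into digit and non-digit chunks so numbers compare numerically.

    Each chunk becomes a tuple with a type tag first, so integers and strings
    are never compared directly with each other.
    """
    return [
        (0, int(part), "") if part.isdigit() else (1, 0, part)
        for part in re.split(r"(\d+)", feature_id)
    ]

# Hypothetical IDs, for illustration only.
ids = [
    "fig|83332.12.peg.1",
    "fig|83332.12.peg.10",
    "fig|83332.12.peg.2",
    "fig|83332.12.peg.100",
]

# Plain string sort: peg.1, peg.10, peg.100, peg.2
lexicographic = sorted(ids)

# Natural sort: peg.1, peg.2, peg.10, peg.100
natural = sorted(ids, key=natural_key)
```

A plain string sort is what a sort on a string unique key gives you, which is why peg.10 lands before peg.2 in the download.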

mshukla1 commented 6 years ago

@dmachi

Any solution or action here? If not, please comment as such and close the ticket.

dmachi commented 6 years ago

As Harry said, this is because Solr requires sorting on a unique field in order to do the streaming. I don't currently have another solution.
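One thing that may be worth checking, as a sketch only: if the streaming in question is built on Solr's cursor-based paging (cursorMark), Solr requires the sort to include the uniqueKey field, but other fields can come first and the unique field can act as a tie-breaker. Whether the data API actually uses cursorMark is not stated in this thread, and the field names `genome_id`, `accession`, and `start` below are assumptions; only `feature_id` is named above.

```python
from urllib.parse import urlencode

def cursor_query_params(genome_id, cursor_mark="*", rows=1000):
    """Build Solr query parameters for one page of a cursor-based request.

    Assumed field names: genome_id, accession, start. The unique key
    (feature_id) goes last in the sort, purely as a tie-breaker.
    """
    return {
        "q": f"genome_id:{genome_id}",
        "sort": "accession asc,start asc,feature_id asc",
        "rows": rows,
        "cursorMark": cursor_mark,  # "*" requests the first page
        "wt": "json",
    }

params = cursor_query_params("83332.12")
query_string = urlencode(params)
```

Each response would carry a nextCursorMark value to pass back as `cursor_mark` for the following page. This only builds the parameters; it does not contact a server.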

ARWattam commented 6 years ago

I think that we need to provide a separate column that gives the number of the gene. RAST does this, primarily to allow sorting. The PATRIC locus tag makes it extremely difficult to sort. Yes, you can use the stop codon location, but if there are a number of contigs, users will have to realize this and sort accordingly. It would be so much easier if we just had a gene order column.

Rebecca
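A rough sketch of what such a gene order column could look like, assuming the downloaded rows carry an accession (contig) and a start coordinate. The column names `accession`, `start`, and `gene_order`, the sample rows, and the `add_gene_order` helper are all hypothetical.

```python
def add_gene_order(rows):
    """Return copies of the rows sorted by (accession, start) with a 1-based gene_order.

    Start values are converted to int so that 150 sorts before 5200
    even when they arrive as strings from a downloaded table.
    """
    ordered = sorted(rows, key=lambda r: (r["accession"], int(r["start"])))
    return [dict(row, gene_order=n) for n, row in enumerate(ordered, start=1)]

# Hypothetical rows, in the order a peg.id string sort might produce.
rows = [
    {"feature_id": "peg.1", "accession": "contig_1", "start": "150"},
    {"feature_id": "peg.10", "accession": "contig_1", "start": "5200"},
    {"feature_id": "peg.2", "accession": "contig_2", "start": "90"},
    {"feature_id": "peg.3", "accession": "contig_1", "start": "900"},
]

with_order = add_gene_order(rows)
```

With a single numeric column like this, a user with many contigs would only have to sort on one field.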

hyoo commented 6 years ago

Starting with version 6.x, Solr has a new feature called streaming expressions:

https://lucene.apache.org/solr/guide/6_6/streaming-expressions.html

I think this could be a solution for this issue. I am planning to work on it after the December release.

rkenyon commented 6 years ago

@ARWattam this is not a fix per se, but the user can sort the Excel table on "Start" to get the correct syntenic order.

ARWattam commented 6 years ago

No you can't!!!!!!!!!! Just try sorting the peg ids. You have to manually go in and put them in the proper order. It is one of the most frustrating things that I have to do.

rkenyon commented 6 years ago

Sort first by strand (+, -) then by Start site, not PegID. A clunky workaround, but should work.

mshukla1 commented 6 years ago

All PATRIC tables on the website are by default sorted by accession and then start site.

And I agree that, by default, the download table should keep the same sort order as that in the feature table it is downloaded from.

Harry, is this straightforward, or are there any technical challenges?

-Maulik

hyoo commented 6 years ago

No. See the original reason I described above. I checked the streaming expressions in Solr 6, but they only work in cloud mode, so we can't use them.

mshukla1 commented 6 years ago

Check this out, see if it is relevant.

Streaming Requests and Responses
https://lucene.apache.org/solr/guide/6_6/streaming-expressions.html#StreamingExpressions-StreamingRequestsandResponses

Solr has a /stream request handler that takes streaming expression requests and returns the tuples as a JSON stream. This request handler is implicitly defined, meaning there is nothing that has to be defined in solrconfig.xml - see Implicit RequestHandlers https://lucene.apache.org/solr/guide/6_6/implicit-requesthandlers.html#implicit-requesthandlers.

The /stream request handler takes one parameter, expr, which is used to specify the streaming expression. For example, this curl command encodes and POSTs a simple search() expression to the /stream handler:

curl --data-urlencode 'expr=search(enron_emails, q="from:1800flowers*", fl="from, to", sort="from asc", qt="/export")' http://localhost:8983/solr/enron_emails/stream

Data Requirements
https://lucene.apache.org/solr/guide/6_6/streaming-expressions.html#StreamingExpressions-DataRequirements

Because streaming expressions relies on the /export handler, many of the field and field type requirements to use /export are also requirements for /stream, particularly for sort and fl parameters. Please see the section Exporting Result Sets https://lucene.apache.org/solr/guide/6_6/exporting-result-sets.html#exporting-result-sets for details.

Exporting Result Sets: this has more detail on exporting results and sorting.

https://lucene.apache.org/solr/guide/6_6/exporting-result-sets.html#exporting-result-sets
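For anyone who wants to try the quoted curl example from a script, a roughly equivalent request can be built with the Python standard library. This is only a sketch: the host, collection, and expression are copied from the Solr guide's example above, not from PATRIC, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Expression copied from the Solr guide's curl example.
expr = (
    'search(enron_emails, q="from:1800flowers*", fl="from, to", '
    'sort="from asc", qt="/export")'
)

# URL-encode the expr parameter and POST it, as curl --data-urlencode does.
request = Request(
    "http://localhost:8983/solr/enron_emails/stream",
    data=urlencode({"expr": expr}).encode("ascii"),
    method="POST",
)

# Sending it would be urllib.request.urlopen(request); that step is left out
# here because it needs a running Solr instance with the /stream handler.
```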

hyoo commented 6 years ago

First line of the doc, "Streaming Expressions provide a simple yet powerful stream processing language for Solr Cloud."

mshukla1 commented 5 years ago

@hyoo

As we are setting up SolrCloud, we should see if we can solve this sorting issue for the download files.

-M