Open jttkim opened 10 years ago
Hi Jan, I suspect that the biomaRt package is still running against the old perl code base not the java one (Steffen would have to confirm it)
a
On 30 June 2014 19:03, Jan T. Kim notifications@github.com wrote:
Some code of mine recently ran into trouble while processing sequences I retrieved from BioMart using the Bioconductor biomaRt package. The reason turned out to be that a few entries contained the message "Sequence unavailable" rather than a valid sequence (see [1] for my posting reporting this). After email discussions with a biomaRt author and looking into the BioMart code, I think the class generating these entries is likely org.biomart.processors.sequence.Sequence, and its done() method in particular [2].
I currently can't give an account of how attempts to retrieve a "coding" sequence for genes that don't have one yield a result containing the "unavailable" message. However, if my tentative analysis is correct, my suggestion would be to not return any result, rather than one containing a message instead of a sequence.
[1] https://stat.ethz.ch/pipermail/bioconductor/2014-June/060269.html
[2] https://github.com/biomart/BioMart/blob/master/plugins/sequence/src/org/biomart/processors/sequence/Sequence.java, line 275
Reply to this email directly or view it on GitHub https://github.com/biomart/BioMart/issues/1.
Dear Steffen,
can you comment on Arek's response below?
Best regards & thanks in advance, Jan
On Tue, Jul 01, 2014 at 11:38:23PM -0700, Arek Kasprzyk wrote:
Hi Jan, I suspect that the biomaRt package is still running against the old perl code base not the java one (Steffen would have to confirm it)
a
+- Jan T. Kim -------------------------------------------------------+ | email: jttkim@gmail.com | | WWW: http://www.jtkim.dreamhosters.com/ | -----=< hierarchical systems are for files, not for humans >=-----
Hi Jan, Arek,
We're indeed still querying BioMart 0.7; the update that queries 0.9 should be available in the Bioconductor devel repository soon.
Best, Steffen
Dear Steffen, dear Arek,
On Wed, Jul 02, 2014 at 08:58:57AM -0700, Steffen Durinck wrote:
Hi Jan, Arek,
We're indeed still querying BioMart 0.7; the update that queries 0.9 should be available in the Bioconductor devel repository soon.
ok -- thanks for looking into this, so let's wait for the new version to filter through.
Weeding out the "Sequence unavailable" entries is entirely ok for me for now, I just wanted to minimise the amount of replication of such effort around the globe.
Best regards, Jan
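For anyone else needing the same workaround in the meantime, here is a minimal sketch of the weeding-out step. It assumes the retrieved sequences are available as (identifier, sequence) pairs, which is not something this thread specifies. The function name and the sample records are hypothetical; the only detail taken from the thread is the literal message "Sequence unavailable".

```python
# Placeholder text reported in this thread in place of a real sequence.
PLACEHOLDER = "Sequence unavailable"


def drop_unavailable(records):
    """Split (identifier, sequence) pairs into usable records and skipped identifiers.

    Hypothetical helper: the record layout is an assumption, not biomaRt's
    actual return format.
    """
    kept, skipped = [], []
    for identifier, sequence in records:
        # Compare case-insensitively and ignore surrounding whitespace.
        if sequence.strip().lower() == PLACEHOLDER.lower():
            skipped.append(identifier)
        else:
            kept.append((identifier, sequence))
    return kept, skipped


# Made-up sample data, not real BioMart output.
records = [
    ("gene_a", "ATGGCCTAA"),
    ("gene_b", "Sequence unavailable"),
    ("gene_c", "ATGTTTGGGTGA"),
]
kept, skipped = drop_unavailable(records)
print(kept)
print(skipped)
```

Returning the skipped identifiers alongside the kept records makes it easy to log which genes had no sequence instead of dropping them silently.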
Some code of mine recently ran into trouble while processing sequences I retrieved from BioMart using the Bioconductor biomaRt package. The reason turned out to be that a few entries contained the message "Sequence unavailable" rather than a valid sequence (see [1] for my posting reporting this). After email discussions with a biomaRt author and looking into the BioMart code, I think the class generating these entries is likely org.biomart.processors.sequence.Sequence, and its done() method in particular [2].
I currently can't give an account of how attempts to retrieve a "coding" sequence for genes that don't have one yield a result containing the "unavailable" message. However, if my tentative analysis is correct, my suggestion would be to not return any result, rather than one containing a message instead of a sequence.
[1] https://stat.ethz.ch/pipermail/bioconductor/2014-June/060269.html
[2] https://github.com/biomart/BioMart/blob/master/plugins/sequence/src/org/biomart/processors/sequence/Sequence.java, line 275