biocore / deblur

Deblur is a greedy deconvolution algorithm based on known read error profiles.
BSD 3-Clause "New" or "Revised" License
91 stars 41 forks source link

saving to json (when h5py isn't present) doesn't actually work #51

Closed wasade closed 8 years ago

wasade commented 8 years ago

Strongly recommend removing biom 1.x format support and requiring h5py.

ElDeveloper commented 8 years ago

@mcdonadt, why? In QIIME we allow for both formats and I haven't seen any trouble with that.

Yoshiki Vázquez-Baeza

On Aug 19, 2016, at 5:03 PM, Daniel McDonald notifications@github.com wrote:

Strongly recommend removing biom 1.x format support and requiring h5py.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

wasade commented 8 years ago

QIIME writes them out correctly

On Aug 19, 2016 18:26, "Yoshiki Vázquez Baeza" notifications@github.com wrote:

@mcdonadt, why? In QIIME we allow for both formats and I haven't seen any trouble with that.

Yoshiki Vázquez-Baeza

On Aug 19, 2016, at 5:03 PM, Daniel McDonald notifications@github.com wrote:

Strongly recommend removing biom 1.x format support and requiring h5py.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241169795, or mute the thread https://github.com/notifications/unsubscribe-auth/AAc8sjl-fKqm7gVlQ3OA-RhVjLN919poks5qhlfIgaJpZM4Jo8ck .

wasade commented 8 years ago

What's the benefit of supporting both? H5py is trivial to install with conda and pip

On Aug 19, 2016 18:37, "Daniel T. McDonald" Daniel.Mcdonald@colorado.edu wrote:

QIIME writes them out correctly

On Aug 19, 2016 18:26, "Yoshiki Vázquez Baeza" notifications@github.com wrote:

@mcdonadt, why? In QIIME we allow for both formats and I haven't seen any trouble with that.

Yoshiki Vázquez-Baeza

On Aug 19, 2016, at 5:03 PM, Daniel McDonald notifications@github.com wrote:

Strongly recommend removing biom 1.x format support and requiring h5py.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241169795, or mute the thread https://github.com/notifications/unsubscribe-auth/AAc8sjl-fKqm7gVlQ3OA-RhVjLN919poks5qhlfIgaJpZM4Jo8ck .

amnona commented 8 years ago

Some people like to work with txt tables (in r/stass/excel etc.) and i personally know someone who had trouble installing hdf5 on their server :) I think leaving the option to save as text does not cost us much, and may help other people... (and if it is not working - i'll try to make it work)

On Fri, Aug 19, 2016 at 6:39 PM, Daniel McDonald notifications@github.com wrote:

What's the benefit of supporting both? H5py is trivial to install with conda and pip

On Aug 19, 2016 18:37, "Daniel T. McDonald" Daniel.Mcdonald@colorado.edu wrote:

QIIME writes them out correctly

On Aug 19, 2016 18:26, "Yoshiki Vázquez Baeza" <notifications@github.com

wrote:

@mcdonadt, why? In QIIME we allow for both formats and I haven't seen any trouble with that.

Yoshiki Vázquez-Baeza

On Aug 19, 2016, at 5:03 PM, Daniel McDonald < notifications@github.com> wrote:

Strongly recommend removing biom 1.x format support and requiring h5py.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241169795, or mute the thread https://github.com/notifications/unsubscribe- auth/AAc8sjl-fKqm7gVlQ3OA-RhVjLN919poks5qhlfIgaJpZM4Jo8ck .

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241170603, or mute the thread https://github.com/notifications/unsubscribe-auth/AFkA8skcOYztOvA9RyTIYKwsFzZSX9WLks5qhlrvgaJpZM4Jo8ck .

wasade commented 8 years ago

Excel doesn't read json as far as I know

I don't see why it is important to carry maintenance and development burden in order to support the < 0.1% case which can be handled with normal user support

On Aug 19, 2016 19:48, "amnona" notifications@github.com wrote:

Some people like to work with txt tables (in r/stass/excel etc.) and i personally know someone who had trouble installing hdf5 on their server :) I think leaving the option to save as text does not cost us much, and may help other people... (and if it is not working - i'll try to make it work)

On Fri, Aug 19, 2016 at 6:39 PM, Daniel McDonald <notifications@github.com

wrote:

What's the benefit of supporting both? H5py is trivial to install with conda and pip

On Aug 19, 2016 18:37, "Daniel T. McDonald" < Daniel.Mcdonald@colorado.edu> wrote:

QIIME writes them out correctly

On Aug 19, 2016 18:26, "Yoshiki Vázquez Baeza" < notifications@github.com

wrote:

@mcdonadt, why? In QIIME we allow for both formats and I haven't seen any trouble with that.

Yoshiki Vázquez-Baeza

On Aug 19, 2016, at 5:03 PM, Daniel McDonald < notifications@github.com> wrote:

Strongly recommend removing biom 1.x format support and requiring h5py.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241169795, or mute the thread https://github.com/notifications/unsubscribe- auth/AAc8sjl-fKqm7gVlQ3OA-RhVjLN919poks5qhlfIgaJpZM4Jo8ck .

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241170603, or mute the thread https://github.com/notifications/unsubscribe-auth/ AFkA8skcOYztOvA9RyTIYKwsFzZSX9WLks5qhlrvgaJpZM4Jo8ck .

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241174444, or mute the thread https://github.com/notifications/unsubscribe-auth/AAc8slAtN1Uqsl6vK4i-2_2VlGAl1brrks5qhmr1gaJpZM4Jo8ck .

wasade commented 8 years ago

Can keep but please verify functionality. Tests were not passing until h5py was present

On Aug 19, 2016 19:56, "Daniel T. McDonald" Daniel.Mcdonald@colorado.edu wrote:

Excel doesn't read json as far as I know

I don't see why it is important to carry maintenance and development burden in order to support the < 0.1% case which can be handled with normal user support

On Aug 19, 2016 19:48, "amnona" notifications@github.com wrote:

Some people like to work with txt tables (in r/stass/excel etc.) and i personally know someone who had trouble installing hdf5 on their server :) I think leaving the option to save as text does not cost us much, and may help other people... (and if it is not working - i'll try to make it work)

On Fri, Aug 19, 2016 at 6:39 PM, Daniel McDonald < notifications@github.com> wrote:

What's the benefit of supporting both? H5py is trivial to install with conda and pip

On Aug 19, 2016 18:37, "Daniel T. McDonald" < Daniel.Mcdonald@colorado.edu> wrote:

QIIME writes them out correctly

On Aug 19, 2016 18:26, "Yoshiki Vázquez Baeza" < notifications@github.com

wrote:

@mcdonadt, why? In QIIME we allow for both formats and I haven't seen any trouble with that.

Yoshiki Vázquez-Baeza

On Aug 19, 2016, at 5:03 PM, Daniel McDonald < notifications@github.com> wrote:

Strongly recommend removing biom 1.x format support and requiring h5py.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <https://github.com/biocore/deblur/issues/51#issuecomment-241169795 , or mute the thread https://github.com/notifications/unsubscribe- auth/AAc8sjl-fKqm7gVlQ3OA-RhVjLN919poks5qhlfIgaJpZM4Jo8ck .

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241170603, or mute the thread https://github.com/notifications/unsubscribe-auth/AFkA8skcO YztOvA9RyTIYKwsFzZSX9WLks5qhlrvgaJpZM4Jo8ck .

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241174444, or mute the thread https://github.com/notifications/unsubscribe-auth/AAc8slAtN1Uqsl6vK4i-2_2VlGAl1brrks5qhmr1gaJpZM4Jo8ck .

ElDeveloper commented 8 years ago

Mainly convenience (installation of hdf5/h5py), the biom-format package already supports both formats, so the burden for this package is minimal.

On (Aug-19-16|18:39), Daniel McDonald wrote:

What's the benefit of supporting both? H5py is trivial to install with conda and pip

On Aug 19, 2016 18:37, "Daniel T. McDonald" Daniel.Mcdonald@colorado.edu wrote:

QIIME writes them out correctly

On Aug 19, 2016 18:26, "Yoshiki Vázquez Baeza" notifications@github.com wrote:

@mcdonadt, why? In QIIME we allow for both formats and I haven't seen any trouble with that.

Yoshiki Vázquez-Baeza

On Aug 19, 2016, at 5:03 PM, Daniel McDonald notifications@github.com wrote:

Strongly recommend removing biom 1.x format support and requiring h5py.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241169795, or mute the thread https://github.com/notifications/unsubscribe-auth/AAc8sjl-fKqm7gVlQ3OA-RhVjLN919poks5qhlfIgaJpZM4Jo8ck .

You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/biocore/deblur/issues/51#issuecomment-241170603

wasade commented 8 years ago

Makes it more difficult to kill off old formats though.

On Aug 20, 2016 16:32, "Yoshiki Vázquez Baeza" notifications@github.com wrote:

Mainly convenience (installation of hdf5/h5py), the biom-format package already supports both formats, so the burden for this package is minimal.

On (Aug-19-16|18:39), Daniel McDonald wrote:

What's the benefit of supporting both? H5py is trivial to install with conda and pip

On Aug 19, 2016 18:37, "Daniel T. McDonald" < Daniel.Mcdonald@colorado.edu> wrote:

QIIME writes them out correctly

On Aug 19, 2016 18:26, "Yoshiki Vázquez Baeza" < notifications@github.com> wrote:

@mcdonadt, why? In QIIME we allow for both formats and I haven't seen any trouble with that.

Yoshiki Vázquez-Baeza

On Aug 19, 2016, at 5:03 PM, Daniel McDonald < notifications@github.com> wrote:

Strongly recommend removing biom 1.x format support and requiring h5py.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241169795, or mute the thread https://github.com/notifications/unsubscribe- auth/AAc8sjl-fKqm7gVlQ3OA-RhVjLN919poks5qhlfIgaJpZM4Jo8ck .

You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/biocore/deblur/issues/51#issuecomment-241170603

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241230016, or mute the thread https://github.com/notifications/unsubscribe-auth/AAc8sll8IKvFx4Dezuy2rGx7dC_jz_K8ks5qh46mgaJpZM4Jo8ck .

wasade commented 8 years ago

Any issues here are alleviated by adding h5py to conda_requirements.txt. Can we do that and drop 1.0.0 support?

mortonjt commented 8 years ago

Yeah - now that hdf5 and h5py are conda installable, this shouldn't be an issue. I'm down for dropping the old formats, considering that converting between them isn't too bad.

ElDeveloper commented 8 years ago

If the plan is to stop supporting the JSON-based format at some point, then this makes sense.

On (Aug-22-16|17:03), Daniel McDonald wrote:

Any issues here are alleviated by adding h5py to conda_requirements.txt. Can we do that and drop 1.0.0 support?

You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/biocore/deblur/issues/51#issuecomment-241588276

wasade commented 8 years ago

That is my hope, and it won't die if biocore continues to foster its use.

Current plan with q2 is support reading in order to handle legacy q1 data, but writing will be biom-format 2.1. The JSON representation hasn't scaled to the sized datasets in use for a few years now. If you haven't looked at biom.Table.to/from_json, I encourage you to take a peak at the terrible things done there in order to allow that representation to scale. (although, in general the biom io subsystem sucks but the larger effort is on delay for skbio)

ElDeveloper commented 8 years ago

OK, in that case it makes sense to drop support for 1.0, 👍

amnona commented 8 years ago

Agree about removing json How about writing to .txt if no hdf5? (We should also add the .txt output option as a flag for users who like to parse txt table....)

On Mon, Aug 22, 2016 at 5:42 PM, Yoshiki Vázquez Baeza < notifications@github.com> wrote:

OK, in that case it makes sense to drop support for 1.0, 👍

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241594169, or mute the thread https://github.com/notifications/unsubscribe-auth/AFkA8rJXiSRUdmZAdc9w7Q_AwG_RKnIVks5qikH8gaJpZM4Jo8ck .

mortonjt commented 8 years ago

That isn't critical, since that is already handled by biom explicitly out of the box

http://biom-format.org/documentation/biom_conversion.html

If necessary, we could add docs to the README to alert users how to convert their biom tables to text files.

On Mon, Aug 22, 2016 at 5:48 PM, amnona notifications@github.com wrote:

Agree about removing json How about writing to .txt if no hdf5? (We should also add the .txt output option as a flag for users who like to parse txt table....)

On Mon, Aug 22, 2016 at 5:42 PM, Yoshiki Vázquez Baeza < notifications@github.com> wrote:

OK, in that case it makes sense to drop support for 1.0, 👍

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241594169, or mute the thread https://github.com/notifications/unsubscribe- auth/AFkA8rJXiSRUdmZAdc9w7Q_AwG_RKnIVks5qikH8gaJpZM4Jo8ck .

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241595046, or mute the thread https://github.com/notifications/unsubscribe-auth/AD_a3bZG38L_uzpOrh7NBBDxB9RmWAS7ks5qikNNgaJpZM4Jo8ck .

wasade commented 8 years ago

we can just require h5py then we don't have an issue of hdf5 not being present

On Mon, Aug 22, 2016 at 5:48 PM, amnona notifications@github.com wrote:

Agree about removing json How about writing to .txt if no hdf5? (We should also add the .txt output option as a flag for users who like to parse txt table....)

On Mon, Aug 22, 2016 at 5:42 PM, Yoshiki Vázquez Baeza < notifications@github.com> wrote:

OK, in that case it makes sense to drop support for 1.0, 👍

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241594169, or mute the thread https://github.com/notifications/unsubscribe- auth/AFkA8rJXiSRUdmZAdc9w7Q_AwG_RKnIVks5qikH8gaJpZM4Jo8ck .

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/biocore/deblur/issues/51#issuecomment-241595046, or mute the thread https://github.com/notifications/unsubscribe-auth/AAc8sviELwyD_LNb7EilFH4Rsx0eSKx2ks5qikNMgaJpZM4Jo8ck .