LSSTDESC / rail_base

Base classes for RAIL
MIT License

Using the new tables_io functions for the parquet file iterator #107

Closed joselotl closed 2 months ago

joselotl commented 2 months ago

Problem & Solution Description (including issue #)

Code Quality

joselotl commented 2 months ago

@sidneymau If you have time, could you try running one estimator directly on your parquet files to see if this fix works?

eacharles commented 2 months ago

I think that tables_io.io.getInputDataLength just decides which function to call based on the type of data, i.e. it calls tables_io.io.getInputDataLengthPq, or the Hdf5 one, or whatever.

On May 1, 2024, at 1:58 AM, hangqianjun commented on this pull request:

In src/rail/core/data.py:

@@ -367,6 +367,9 @@ class PqHandle(TableHandle):

 suffix = "pq"


hangqianjun commented 2 months ago

I think that tables_io.io.getInputDataLength just decides which function to call based on the type of data, i.e. it calls tables_io.io.getInputDataLengthPq, or the Hdf5 one, or whatever.

Yes exactly. I thought getInputDataLength already has the functionality so we can use it instead of getInputDataLengthPq. However it's no big issue, as the code still works! It just made me go look up the difference between the two when I saw this, and that's why I raised the question! Happy to approve the PR.

eacharles commented 2 months ago

So, to be clear, if you know what type of data you have, you should probably call the appropriate function directly. But if you aren't sure, then you should call the generic function.


hangqianjun commented 2 months ago

So, to be clear, if you know what type of data you have, you should probably call the appropriate function directly. But if you aren't sure, then you should call the generic function.

I see! Thanks for the explanation!
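
For readers skimming the thread, the distinction discussed above (a generic entry point that dispatches on data type, versus format-specific functions you call when the type is already known) can be sketched roughly as follows. This is only an illustration of the pattern, not the tables_io implementation: the names `length_csv`, `length_json` and `length_generic` are hypothetical, and CSV/JSON stand in for parquet/HDF5 so the sketch needs nothing beyond the standard library.

```python
import csv
import json
import os


# Hypothetical format-specific helpers, playing the role that functions
# like getInputDataLengthPq play in the discussion above.
def length_csv(path):
    """Return the number of data rows in a CSV file (header excluded)."""
    with open(path, newline="") as f:
        return sum(1 for _ in csv.DictReader(f))


def length_json(path):
    """Return the number of records in a JSON file holding a list."""
    with open(path) as f:
        return len(json.load(f))


# Assumed dispatch rule for this sketch: choose the helper by file suffix.
_LENGTH_BY_SUFFIX = {".csv": length_csv, ".json": length_json}


def length_generic(path):
    """Generic entry point: pick the format-specific helper, then call it."""
    suffix = os.path.splitext(path)[1].lower()
    try:
        helper = _LENGTH_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix!r}") from None
    return helper(path)
```

In a class that already knows its format (as a parquet-specific handle would), calling the specific helper directly skips the lookup and makes the intent explicit. The generic function is the safer choice when the caller cannot be sure what kind of file it has been given.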