cosmos / cosmos-sdk

:chains: A Framework for Building High Value Public Blockchains :sparkles:
https://cosmos.network/
Apache License 2.0
6.22k stars 3.59k forks source link

perf: x/genutil/types: AppGenesisFromFile and AppGenesisFromReader are unnecessarily inefficient by fully slurping the io.Reader to RAM then invoking json.Unmarshal then attempting on error to invoke cometbft/cmtjson.Unmarshal #19269

Closed odeke-em closed 3 weeks ago

odeke-em commented 8 months ago

Is there an existing issue for this?

What happened?

While auditing cometbft to evaluate how we can use an incremental JSON parser instead of always invoking cometbft/libs/json.Unmarshal(bytes...) which requires reading an entire Genesis file into RAM, I noticed this code https://github.com/cosmos/cosmos-sdk/blob/430b53236ef6fb775fe1eb753c0154b17ba419ac/x/genutil/types/genesis.go#L91-L101

The need to invoke cmtjson.Unmarshal as an alternative seems to pigeonhole the need to slurp in all the bytes

Unnecessary invocation of bufio.NewReader

This code is unnecessary to pass in bufio.NewReader(*os.File) yet immediately just invoking io.ReadAll, to read the entire file into RAM

Prognosis

These problems are from pigeon holing and over-generalizations because: a) The use of AppGenesisFromFile is in server/start.go and it opens an *os.File which implements io.ReadSeekCloser which can help us in the case that the encoding/json parsing would work and perhaps not having known about io.ReadSeeker might have constrained the thinking and implementation but even bytes.NewReader implements

diff --git a/x/genutil/types/genesis.go b/x/genutil/types/genesis.go
index b5b335ca1..8bbbbfa9d 100644
--- a/x/genutil/types/genesis.go
+++ b/x/genutil/types/genesis.go
@@ -89,35 +89,72 @@ func (ag *AppGenesis) SaveAs(file string) error {

 // AppGenesisFromReader reads the AppGenesis from the reader.
 func AppGenesisFromReader(reader io.Reader) (*AppGenesis, error) {
+   // If the reader already implements io.ReadSeeker, just use it.
+   if rs, ok := reader.(io.ReadSeeker); ok {
+       return fAppGenesis(rs)
+   }
+
+   // Otherwise compose for it the io.ReadSeeker using bytes.NewReader
    jsonBlob, err := io.ReadAll(reader)
    if err != nil {
        return nil, err
    }
+   return fAppGenesis(bytes.NewReader(jsonBlob))
+}

-   var appGenesis AppGenesis
-   if err := json.Unmarshal(jsonBlob, &appGenesis); err != nil {
-       // fallback to CometBFT genesis
-       var ctmGenesis cmttypes.GenesisDoc
-       if err2 := cmtjson.Unmarshal(jsonBlob, &ctmGenesis); err2 != nil {
-           return nil, fmt.Errorf("error unmarshalling AppGenesis: %w\n failed fallback to CometBFT GenDoc: %w", err, err2)
-       }
+// fAppGenesis takes in an io.ReadSeeker and firstly attempts to JSON parse
+// it using the Go standard library's encoding/json, and if that fails, falls
+// back to using cometbft genesis parsing with cmtjson.Unmarshal.
+func fAppGenesis(rs io.ReadSeeker) (*AppGenesis, error) {
+   appGenesis := new(AppGenesis)
+   dec := json.NewDecoder(rs)
+   err := dec.Decode(appGenesis)
+   if err == nil {
+       return appGenesis, nil
+   }

-       appGenesis = AppGenesis{
-           AppName: version.AppName,
-           // AppVersion is not filled as we do not know it from a CometBFT genesis
-           GenesisTime:   ctmGenesis.GenesisTime,
-           ChainID:       ctmGenesis.ChainID,
-           InitialHeight: ctmGenesis.InitialHeight,
-           AppHash:       ctmGenesis.AppHash,
-           AppState:      ctmGenesis.AppState,
-           Consensus: &ConsensusGenesis{
-               Validators: ctmGenesis.Validators,
-               Params:     ctmGenesis.ConsensusParams,
-           },
-       }
+   // Otherwise fallback to CometBFT genesis parsing.
+   // Rewind the reader to the front.
+   if _, serr := rs.Seek(0, io.SeekStart); serr != nil {
+       return nil, fmt.Errorf("error seeking back to the front: %w\n had an error before: %w", serr, err)
+   }
+
+   // TODO: Once cmtjson implements the incremental JSON parser we shall need
+   // to replace this code to avoid pigeon-holing the implementation to slurp in all the bytes.
+   jsonBlob, jerr := io.ReadAll(rs)
+   if jerr != nil {
+       return nil, jerr
+   }
+
+   appGenesis, err2 := appGenesisFromCometBFT(jsonBlob)
+   if err2 != nil {
+       return nil, fmt.Errorf("error unmarshalling AppGenesis: %w\n failed fallback to CometBFT GenDoc: %w", err, err2)
    }

-   return &appGenesis, nil
+   return appGenesis, nil
+}
+
+func appGenesisFromCometBFT(jsonBlob []byte) (*AppGenesis, error) {
+   ctmGenesis := new(cmttypes.GenesisDoc)
+   // TODO: Once cmtjson implements the incremental JSON parser we shall need
+   // to replace this code to avoid pigeon-holing the implementation to slurp in all the bytes.
+   if err2 := cmtjson.Unmarshal(jsonBlob, ctmGenesis); err2 != nil {
+       return nil, fmt.Errorf("error unmarshalling AppGenesis: %w\n failed fallback to CometBFT GenDoc: %w", err, err2)
+   }
+
+   return &AppGenesis{
+       AppName: version.AppName,
+       // AppVersion is not filled as we do not know it from a CometBFT genesis
+       GenesisTime:   ctmGenesis.GenesisTime,
+       ChainID:       ctmGenesis.ChainID,
+       InitialHeight: ctmGenesis.InitialHeight,
+       AppHash:       ctmGenesis.AppHash,
+       AppState:      ctmGenesis.AppState,
+       Consensus: &ConsensusGenesis{
+           Validators: ctmGenesis.Validators,
+           Params:     ctmGenesis.ConsensusParams,
+       },
+   }, nil
 }

 // AppGenesisFromFile reads the AppGenesis from the provided file.
@@ -127,13 +164,15 @@ func AppGenesisFromFile(genFile string) (*AppGenesis, error) {
        return nil, err
    }

-   appGenesis, err := AppGenesisFromReader(bufio.NewReader(file))
+   appGenesis, err := AppGenesisFromReader(file)
+   ferr := file.Close()
+
    if err != nil {
        return nil, fmt.Errorf("failed to read genesis from file %s: %w", genFile, err)
    }

-   if err := file.Close(); err != nil {
-       return nil, err
+   if ferr != nil {
+       return nil, ferr
    }

    return appGenesis, nil

Cosmos SDK Version

main

How to reproduce?

Please see my detailed report.

/cc @elias-orijtech @ValarDragon

tac0turtle commented 8 months ago

want to make a pr?

psiphi5 commented 1 month ago

want to make a pr?

Hi, I'm interested in picking up this issue, would it just involve making the changes to x/genutil/types/genesis.go as described above?

julienrbrt commented 1 month ago

want to make a pr?

Hi, I'm interested in picking up this issue, would it just involve making the changes to x/genutil/types/genesis.go as described above?

Essentially yes, thank you!