aidenlab / juicer

A One-Click System for Analyzing Loop-Resolution Hi-C Experiments
http://aidenlab.org
MIT License

AWK bug rounding in script #241

Open sa501428 opened 2 years ago

sa501428 commented 2 years ago

On certain systems (e.g. Google Colab), awk rounds large integer values or prints them in scientific notation. As a result, the index_by_chr script produces an unusable file.
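A quick way to check whether a given machine is affected is to have awk print a value just above 2^31. This is only a diagnostic sketch: the function name `check_awk_large_int` and the test value are made up for illustration, and the idea that the 2^31 boundary is what matters is an assumption (it matches how some awk builds fall back to the default `%.6g` output format for large numbers), not something confirmed in this thread.

```shell
# Hypothetical diagnostic: does `print` keep a large running offset as plain digits?
check_awk_large_int() {
    # 2147483648 is 2^31; adding 12345 should give 2147495993.
    out=$(awk 'BEGIN { offset = 2147483648; offset += 12345; print offset }')
    if [ "$out" = "2147495993" ]; then
        echo "exact: $out"
    else
        # Typical symptom: scientific notation or fewer significant digits.
        echo "altered: $out"
    fi
}

check_awk_large_int
```

If this reports `altered`, any script that relies on `print` for large byte offsets will likely write rounded positions on that system.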

moshe-olshansky commented 2 years ago

Do you mean that 100000 is represented as 1.0E+5? This also happens in R but there you can discourage such behavior. Possibly there exists an awk flag for this. Does it happen with gawk too?
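On the question of a flag: POSIX awk has no dedicated switch for this, but it does expose the `OFMT` and `CONVFMT` variables, which control how non-integral numbers are formatted by `print` and by string conversion. The sketch below assumes the affected awk routes large values through those formats, so overriding them from the command line restores plain digits. The side effect is that every fractional value the script prints would also be rounded to an integer, so this only suits scripts that output nothing but whole numbers.

```shell
# Sketch: override the default "%.6g" formats from the command line.
# OFMT affects `print`; CONVFMT affects number-to-string conversion (e.g. concatenation).
awk -v OFMT='%.0f' -v CONVFMT='%.0f' 'BEGIN {
    offset = 2147483648 + 12345
    print offset            # formatted with OFMT where the awk treats it as non-integer
    label = offset ""       # formatted with CONVFMT in the same case
    print label
}'
```

On awk implementations that already print large integers exactly, the two overrides are harmless for integral values.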

nchernia commented 2 years ago

You can use printf to avoid this as well.


-- Neva Cherniavsky Durand, Ph.D. | she, her, hers Senior Scientist | Gene Regulation Observatory Broad Institute of MIT and Harvard
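The `printf` suggestion can be sketched as follows. This is not the actual index_by_chr script; `index_offsets` is a hypothetical stand-in that records the byte offset at which each new first-column key starts, and the starting offset argument exists only to simulate a position deep inside a large file. It uses `%.0f` rather than `%d` on the assumption that formatting the value as a floating-point number with no decimals avoids any integer-width limit in a particular awk's `%d` handling; the result is exact for offsets below 2^53.

```shell
# Hypothetical stand-in for an indexing script (not the real index_by_chr).
# Reads lines on stdin and prints "<key> <byte offset>" the first time each key appears.
index_offsets() {
    # $1 = starting offset, used here to simulate a position far into a large file
    awk -v start="$1" '
        BEGIN { offset = start + 0 }
        $1 != prev {
            # "%.0f" writes the number as plain digits, never in scientific notation
            printf "%s %.0f\n", $1, offset
            prev = $1
        }
        # Advance by the line length plus one for the newline (assumes single-byte characters)
        { offset += length($0) + 1 }
    '
}

printf 'chr1 a\nchr1 b\nchr2 c\n' | index_offsets 3000000000
# chr1 3000000000
# chr2 3000000014
```

Because the format is spelled out in the `printf` call itself, the output does not depend on `OFMT` or on which awk implementation the system ships.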