firasmidani / amiga

Analysis of Microbial Growth Assays
https://firasmidani.github.io/amiga/
GNU General Public License v3.0

Index contains duplicate entries, cannot reshape #2

Closed by aassie 3 years ago

aassie commented 3 years ago

Hello Firas,

Thank you for developing this nice tool.

When I run the basic commands, the tool appears to work: it generated nice figures for our plates and a summary table.

However, when I try to pool my replicates, I get the following error:

python ~/Documents/Tools/amiga/amiga.py -i ./Run1/ -o "pooled_analysis" --pool-by "Isolate"

#-----------------------------------------------#
# AMiGA is peeking inside the working directory #
#-----------------------------------------------#

#-------------------------------------------------------------#
# AMiGA is parsing command-line arguments and parameter files #
#-------------------------------------------------------------#

#------------------------------------------#
# AMiGA is parsing and cleaning data files #
#------------------------------------------#

#--------------------------------------------#
# AMiGA is parsing and reading mapping files #
#--------------------------------------------#

#----------------------------------------------------------#
# AMiGA is preparing or analyzing data based on user input #
#----------------------------------------------------------#

#--------------------------------#
# AMiGA is fitting growth curves #
#--------------------------------#

Traceback (most recent call last):
  File "/Users/adrien/Documents/Tools/amiga/amiga.py", line 75, in <module>
    runGrowthFitting(data,mappings,directory,args,verbose=args['verbose'])
  File "/Users/adrien/Documents/Tools/amiga/libs/analyze.py", line 137, in runGrowthFitting
    runCombinedGrowthFitting(data,mapping,directory,args,verbose=verbose)
  File "/Users/adrien/Documents/Tools/amiga/libs/analyze.py", line 285, in runCombinedGrowthFitting
    gm = GrowthModel(df=cond_data,ARD=True,heteroscedastic=fix_noise,nthin=nthin)#,
  File "/Users/adrien/Documents/Tools/amiga/libs/model.py", line 104, in __init__
    sub_df = describeVariance(sub_df,time='Time',od='OD')
  File "/Users/adrien/Documents/Tools/amiga/libs/model.py", line 52, in describeVariance
    tmp = pd.pivot(df,index=time,columns='SID',values=od)
  File "/Users/adrien/anaconda3/envs/amiga/lib/python3.8/site-packages/pandas/core/reshape/pivot.py", line 430, in pivot
    return indexed.unstack(columns)
  File "/Users/adrien/anaconda3/envs/amiga/lib/python3.8/site-packages/pandas/core/series.py", line 3748, in unstack
    return unstack(self, level, fill_value)
  File "/Users/adrien/anaconda3/envs/amiga/lib/python3.8/site-packages/pandas/core/reshape/reshape.py", line 418, in unstack
    unstacker = _Unstacker(
  File "/Users/adrien/anaconda3/envs/amiga/lib/python3.8/site-packages/pandas/core/reshape/reshape.py", line 142, in __init__
    self._make_selectors()
  File "/Users/adrien/anaconda3/envs/amiga/lib/python3.8/site-packages/pandas/core/reshape/reshape.py", line 180, in _make_selectors
    raise ValueError("Index contains duplicate entries, " "cannot reshape")
ValueError: Index contains duplicate entries, cannot reshape

Any idea what causes it?

This is what my metadata files look like:

head Run1/mapping/*.txt 
==> Run1/mapping/LB_1.txt <==
    Isolate Group   Control Replicate
A1  mut1    1   0   1
A2  mut2    1   0   1

==> Run1/mapping/LB_2.txt <==
    Isolate Group   Control Replicate
A1  mut1    1   0   2
A2  mut2    1   0   2

==> Run1/mapping/M9_1.txt <==
    Isolate Group   Control Replicate
A1  mut1    2   0   1
A2  mut2    2   0   1

==> Run1/mapping/M9_2.txt <==
    Isolate Group   Control Replicate
A1  mut1    2   0   2
A2  mut2    2   0   2

Thank you for your help,

Adrien

firasmidani commented 3 years ago

Hi Adrien,

Did you create your own metadata files?

If so, what happens if you add a "Plate_ID" column to each mapping file, as in the example below?

==> Run1/mapping/M9_2.txt <==
    Plate_ID    Isolate Group   Control Replicate
A1  M9_2    mut1    2   0   2
A2  M9_2    mut2    2   0   2
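One plausible reading of the error, offered as an assumption and not as a description of AMiGA's internals: when replicates from several plates are pooled and nothing distinguishes the plates, well A1 of LB_1 and well A1 of LB_2 can end up sharing the same identifier, so each time point has more than one reading for that identifier and a pivot on it cannot produce a unique cell. The sketch below uses only the standard library and made-up OD values; the `Well` field and the `duplicate_keys` helper are hypothetical stand-ins, not AMiGA names.

```python
from collections import Counter


def duplicate_keys(rows, key_fields):
    """Return the sorted key tuples that occur more than once in `rows`."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up readings: well A1 from two replicate plates, two time points each.
rows = [
    {"Plate_ID": "LB_1", "Well": "A1", "Time": 0, "OD": 0.05},
    {"Plate_ID": "LB_1", "Well": "A1", "Time": 600, "OD": 0.07},
    {"Plate_ID": "LB_2", "Well": "A1", "Time": 0, "OD": 0.06},
    {"Plate_ID": "LB_2", "Well": "A1", "Time": 600, "OD": 0.08},
]

# Keyed by time and well alone, the two plates collide at every time point.
by_well = duplicate_keys(rows, ["Time", "Well"])

# Keyed by time, plate, and well, every reading is unique.
by_plate_and_well = duplicate_keys(rows, ["Time", "Plate_ID", "Well"])

print(by_well)
print(by_plate_and_well)
```

With the plate included in the key, the list of colliding keys is empty, which is consistent with a "Plate_ID" column giving each well a distinct identity across plates.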

Firas

On Fri, Feb 12, 2021 at 9:18 AM Adrien Assie wrote:

[quoted text: same message as the original report above]

aassie commented 3 years ago

Ah yes, it worked. Thank you very much!
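For anyone with many mapping files, the column could be added with a short script instead of by hand. This is a minimal sketch, not part of AMiGA: `add_plate_id` is a hypothetical helper, and it assumes the mapping files are tab-delimited with the well IDs in the first column under an empty header cell, as the listing above suggests. It uses the file name without its extension as the Plate_ID, matching the M9_2 example, and demonstrates itself on a temporary copy so that no real file is touched.

```python
import csv
import tempfile
from pathlib import Path


def add_plate_id(path):
    """Insert a Plate_ID column right after the well column of a mapping file.

    The Plate_ID value is the file name without its extension.
    Returns False if the file is empty or already has a Plate_ID column.
    """
    path = Path(path)
    with path.open(newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    if not rows or "Plate_ID" in rows[0]:
        return False
    header, body = rows[0], rows[1:]
    new_rows = [header[:1] + ["Plate_ID"] + header[1:]]
    # Skip blank lines; keep every other row and add the plate name.
    new_rows += [row[:1] + [path.stem] + row[1:] for row in body if row]
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(new_rows)
    return True


# Demonstration on a temporary file that mimics Run1/mapping/M9_2.txt.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "M9_2.txt"
    demo.write_text(
        "\tIsolate\tGroup\tControl\tReplicate\n"
        "A1\tmut1\t2\t0\t2\n"
        "A2\tmut2\t2\t0\t2\n"
    )
    changed = add_plate_id(demo)
    result = demo.read_text()

print(result)
```

To apply it to a real run, one would loop over the mapping directory, for example `for f in sorted(Path("Run1/mapping").glob("*.txt")): add_plate_id(f)`, after backing up the files, since the helper overwrites them in place.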